Abstract



Recently, Arikan introduced the method of channel polarization, on which one can construct efficient
capacity-achieving codes, called polar codes, for any binary discrete memoryless channel. In this thesis, we
show that the decoding algorithm of polar codes, called successive cancellation decoding, can be regarded as
belief propagation decoding, which has been used for decoding low-density parity-check codes, on a tree
graph. On the basis of this observation, we show an efficient construction method for polar codes using density
evolution, which has been used to evaluate the error probability of belief propagation decoding on a
tree graph. We further show that the channel polarization phenomenon and polar codes can be generalized to
non-binary discrete memoryless channels. The asymptotic performance of non-binary polar codes that use
non-binary matrices called Reed-Solomon matrices is better than that of the best
explicitly known binary polar code. We also find that the Reed-Solomon matrices can be regarded as a natural
generalization of the original binary channel polarization introduced by Arikan.
CHAPTER 1



Introduction 

1.1. Overview 

The channel coding problem, in which one attempts to realize reliable communication over an unreliable
channel, is one of the most central problems of information theory. Although it had been believed that
an unlimited amount of redundancy is needed for reliable communication, Shannon showed that in large
systems one only has to pay a limited amount of redundancy for reliable communication [17]. Shannon's result
is referred to as "the channel coding theorem". Although the theorem shows the existence of a good channel
code, we still have to find explicit codes and practical encoding and decoding algorithms in order to
realize reliable and efficient communication.

1.2. Channel Model and Channel Coding Problem 

1.2.1. Channel model. Let $\mathcal{X}$ and $\mathcal{Y}$ denote the input and output alphabets. Assume that $\mathcal{X}$ is finite
and that $\mathcal{Y}$ is at most countable. A discrete memoryless channel $W$ is defined as a family of conditional probability
distributions $W(y \mid x)$ of $y \in \mathcal{Y}$ for all $x \in \mathcal{X}$, which represent the probability that the channel output is $y$ when $x$ is
transmitted.

1.2.2. Channel coding problem. Let $\mathcal{M}$ and $M$ denote a set of messages and its cardinality, respectively.
When $M = |\mathcal{X}|$, we can make a one-to-one correspondence between $\mathcal{M}$ and $\mathcal{X}$. Let us consider
communication where a sender transmits $x \in \mathcal{X}$, which represents a corresponding message $m \in \mathcal{M}$, and a
receiver estimates $m$ (equivalently $x$) from the received symbol $y$. Let $\psi(y) \in \mathcal{M}$ denote an estimate given $y$.
In this communication, the error probability of a channel $W$ is

$$\frac{1}{M} \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} W(y \mid x)\, \mathbb{I}\{\psi(y) \neq x\}$$

where $\mathbb{I}$ is the indicator function.

When the error probability of $W$ is larger than desired even if the estimator is optimal, we have to
consider using the channel $W$ multiple times in order to improve the reliability of communication. Let $z_0^{n-1}$ denote
a vector $(z_0, \dots, z_{n-1})$ and $z_i^j$ denote the subvector $(z_i, \dots, z_j)$ of $z_0^{n-1}$. If one sends $x_0^{n-1} \in \mathcal{X}^n$ by using a channel
$W$ $n$ times, we assume that the transition probability is $W^n(y_0^{n-1} \mid x_0^{n-1}) := \prod_{i=0}^{n-1} W(y_i \mid x_i)$ for all $y_0^{n-1} \in \mathcal{Y}^n$.
This property of the channel is referred to as memoryless.

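Mappings $\phi : \mathcal{M} \to \mathcal{X}^n$ and $\psi : \mathcal{Y}^n \to \mathcal{M}$ denote an encoder and a decoder, respectively, for some $n \in \mathbb{N}$
called the blocklength. The image of $\phi$ and its elements are called the code and codewords, respectively. The error
probability of a code is defined as

$$\frac{1}{M} \sum_{a \in \mathcal{M}} \sum_{y_0^{n-1} \in \mathcal{Y}^n} W^n(y_0^{n-1} \mid \phi(a))\, \mathbb{I}\{\psi(y_0^{n-1}) \neq a\}.$$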
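The error probability of a code can be evaluated by direct enumeration for very small blocklengths. The following is a minimal sketch, assuming a binary symmetric channel and a length-3 repetition code with majority decoding; the channel, the code, and all function names are illustrative choices and do not come from the thesis.

```python
from itertools import product


def bsc(p):
    """Transition probabilities W[x][y] of a binary symmetric channel (assumed example)."""
    return {0: {0: 1 - p, 1: p}, 1: {0: p, 1: 1 - p}}


def block_prob(W, y, x):
    """Memoryless extension: W^n(y | x) = prod_i W(y_i | x_i)."""
    prob = 1.0
    for yi, xi in zip(y, x):
        prob *= W[xi][yi]
    return prob


def code_error_probability(W, encoder, decoder, n, outputs=(0, 1)):
    """(1/M) * sum_a sum_y W^n(y | phi(a)) * 1{psi(y) != a}, by enumeration."""
    total = 0.0
    for a, codeword in encoder.items():
        for y in product(outputs, repeat=n):
            if decoder(y) != a:
                total += block_prob(W, y, codeword)
    return total / len(encoder)


# Hypothetical code: two messages, repetition encoder, majority decoder.
encoder = {0: (0, 0, 0), 1: (1, 1, 1)}
decoder = lambda y: int(sum(y) >= 2)
p_err = code_error_probability(bsc(0.1), encoder, decoder, n=3)
```

The enumeration costs $|\mathcal{Y}|^n$ evaluations per message, so it is only useful as a sanity check on toy examples.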

In order to measure the efficiency of communication, the coding rate, defined as $\log M / n$, is considered.
Shannon and other researchers showed that there exists an asymptotically best trade-off between the coding rate and
the error probability of a code.

Theorem 1.1 (Channel coding theorem). There exists a quantity $C(W) \in (0,1)$, called the capacity of a channel
$W$, which has the following properties.

There exist sequences of encoders $\phi_i : \mathcal{M}_i \to \mathcal{X}^{n_i}$ and decoders $\psi_i : \mathcal{Y}^{n_i} \to \mathcal{M}_i$ such that the error probabilities
tend to 0 and the limit superior of $\log |\mathcal{M}_i| / n_i$ is smaller than $C(W)$.
Conversely, for any sequences of encoders $\phi_i : \mathcal{M}_i \to \mathcal{X}^{n_i}$ and decoders $\psi_i : \mathcal{Y}^{n_i} \to \mathcal{M}_i$ where the limit
inferior of $\log |\mathcal{M}_i| / n_i$ is larger than $C(W)$, the error probabilities tend to 1.

The channel coding theorem only shows existence of sequences of encoders and decoders on which 
reliable efficient communication is possible. One of the goals of coding theory is to find practical encoders 
and decoders which achieve the best trade-off described in the channel coding theorem. 



1.3. Preview of Polar Codes 

Polar codes, introduced by Arikan [2], are the first provably capacity-achieving codes for any symmetric
binary-input discrete memoryless channel (B-DMC) that have low-complexity encoding and decoding
algorithms. The complexities of encoding and decoding are both $O(N \log N)$, where $N$ is the blocklength. Polar
codes are based on the channel polarization phenomenon.

Arikan and Telatar showed that the asymptotic error probability of polar codes whose coding rate is smaller
than capacity is $o(2^{-N^{\beta}})$ for any $\beta < 1/2$ and $\omega(2^{-N^{\beta}})$ for any $\beta > 1/2$ [3]. Since the error probabilities of the
best codes decay exponentially in the blocklength [6], polar codes are not optimal in the asymptotic regime.
In the original work of Arikan, generator matrices of polar codes are constructed by choosing rows of
$G^{\otimes n}$, where

$$G = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}$$

and where $\otimes n$ denotes the Kronecker power. On the other hand, Korada, Şaşoğlu, and Urbanke generalized polar codes
to codes constructed from larger matrices instead of $G$ [9]. Further,
they showed that the asymptotic performance of polar codes is improved by using larger matrices.
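The Kronecker-power structure can be made concrete with a few lines of code. The sketch below builds the $n$-th Kronecker power of Arikan's 2x2 matrix over plain Python lists and selects rows to form a generator matrix; the helper names and the particular row choice are hypothetical and only illustrate the mechanics, not a recommended code.

```python
def kron(a, b):
    """Kronecker product of two 0/1 matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def kron_power(g, n):
    """n-th Kronecker power of g (the 1x1 identity for n = 0)."""
    result = [[1]]
    for _ in range(n):
        result = kron(result, g)
    return result


def select_rows(matrix, rows):
    """Generator matrix obtained by keeping only the listed rows."""
    return [matrix[i] for i in rows]


G = [[1, 0], [1, 1]]
G2 = kron_power(G, 2)
G3 = kron_power(G, 3)
# Hypothetical choice of rows; a real polar code picks them by reliability.
generator = select_rows(G3, [3, 5, 6, 7])
```

For this matrix, row $i$ of the $n$-th Kronecker power has Hamming weight $2^{w(i)}$, where $w(i)$ is the number of ones in the binary expansion of $i$, which is easy to confirm numerically.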

Korada and Urbanke showed that polar codes also achieve the symmetric rate-distortion trade-off as lossy
source codes [10]. They also showed that polar codes achieve the optimal rates of the Wyner-Ziv and Gelfand-Pinsker
problems.



1.4. Contribution of the Thesis 

1.4.1. Construction of polar codes. In Arikan's original work, the complexity of constructing polar
codes grows exponentially in the blocklength. We show a novel construction method whose complexity is
linear in the blocklength. The construction method is based on density evolution, which has been used to
calculate the large-blocklength limit of the bit error probability of low-density parity-check (LDPC)
codes [15].

1.4.2. Generalization of polar codes. Non-binary polar codes are considered. When the input
alphabet is a finite field, we obtain sufficient conditions on a matrix under which capacity-achieving polar
codes can be constructed for any DMC. We also consider polar codes constructed from a non-linear mapping
instead of a linear mapping.



1.5. Organization of the Thesis 

In Chapter 2, the channel polarization phenomenon for B-DMCs, introduced by Arikan [2], is considered. In
Chapter 3, the speed of channel polarization, shown by Arikan and Telatar [3], is considered. In Chapter 4, we
define polar codes, which are based on the channel polarization phenomenon [2]. It is shown that the complexities
of encoding and decoding are $O(N \log N)$, where $N$ is the blocklength. We show a novel construction method
whose complexity is linear in the blocklength [11] for symmetric B-DMCs. In Chapter 5, channel polarization
of $q$-ary channels is considered. Sufficient conditions for channel polarization matrices and a simple example
are shown.



1.6. Notations and Useful Facts 

In the thesis, we use the following notation. Let $x_0^{n-1}$ and $x_i^j$ denote a row vector $(x_0, \dots, x_{n-1})$ and its
subvector $(x_i, \dots, x_j)$. For $\mathcal{A} = \{a_0, \dots, a_{m-1}\} \subseteq \{0, \dots, n-1\}$, $x_{\mathcal{A}}$ denotes the subvector $(x_{a_0}, \dots, x_{a_{m-1}})$. Let
$\mathcal{F}^c$ denote the complement of a set $\mathcal{F}$ and $|\mathcal{F}|$ denote the cardinality of $\mathcal{F}$. Let $G_{ij}$ denote the $(i,j)$ element of a
matrix $G$.
Let $X$, $Y$ and $Z$ be random variables on a probability space $(\Omega, P)$ ranging on discrete sets $A$, $B$ and
$C$, respectively. The mutual information between $X$ and $Y$ is defined as

$$I(X;Y) := \sum_{x \in A,\, y \in B} P(X=x, Y=y) \log \frac{P(X=x, Y=y)}{P(X=x)\,P(Y=y)}.$$

Similarly, the mutual information between $X$ and $(Y,Z)$ is defined as

$$I(X;YZ) := \sum_{x \in A,\, y \in B,\, z \in C} P(X=x, Y=y, Z=z) \log \frac{P(X=x, Y=y, Z=z)}{P(X=x)\,P(Y=y, Z=z)}.$$

The conditional mutual information between $X$ and $Y$ given $Z$ is defined as

$$I(X;Y \mid Z) := \sum_{x \in A,\, y \in B,\, z \in C} P(X=x, Y=y, Z=z) \log \frac{P(X=x, Y=y \mid Z=z)}{P(X=x \mid Z=z)\,P(Y=y \mid Z=z)}.$$

The most fundamental fact in the thesis, called the chain rule for mutual information, is the following. 
Proposition 1.2. IS) 

$$I(X;YZ) = I(X;Y) + I(X;Z \mid Y)$$
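The chain rule can be checked numerically on any small joint distribution. The sketch below is an illustration only: the joint distribution on $\{0,1\}^3$ is an arbitrary assumed example, and the function names are hypothetical.

```python
from itertools import product
from math import log2


def marginal(joint, keep):
    """Marginalize a joint pmf {(x, y, z): p} onto the coordinates in `keep`."""
    out = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


def mutual_information(joint, a, b, cond=()):
    """I(A; B | C) in bits, where a, b, cond are tuples of coordinate indices."""
    p_abc = marginal(joint, a + b + cond)
    p_ac = marginal(joint, a + cond)
    p_bc = marginal(joint, b + cond)
    p_c = marginal(joint, cond)  # {(): 1.0} when cond is empty
    total = 0.0
    for key, p in p_abc.items():
        if p == 0.0:
            continue
        ka = key[:len(a)]
        kb = key[len(a):len(a) + len(b)]
        kc = key[len(a) + len(b):]
        total += p * log2(p * p_c[kc] / (p_ac[ka + kc] * p_bc[kb + kc]))
    return total


# Arbitrary joint pmf of (X, Y, Z) with weights 1..8 (sum 36), for illustration.
joint = {xyz: w / 36 for w, xyz in enumerate(product((0, 1), repeat=3), start=1)}

lhs = mutual_information(joint, (0,), (1, 2))            # I(X; YZ)
rhs = (mutual_information(joint, (0,), (1,))             # I(X; Y)
       + mutual_information(joint, (0,), (2,), (1,)))    # I(X; Z | Y)
```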
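The cutoff rate of $(X,Y)$ is defined as

$$R_0(X;Y) := -\log \sum_{y \in B} \left[ \sum_{x \in A} P(X=x) \sqrt{P(Y=y \mid X=x)} \right]^2.$$

Similarly, the conditional cutoff rate of $(X,Y)$ given $Z$ is defined as

$$R_0(X;Y \mid Z) := -\log \sum_{y \in B,\, z \in C} P(Z=z) \left[ \sum_{x \in A} P(X=x \mid Z=z) \sqrt{P(Y=y \mid X=x, Z=z)} \right]^2.$$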



In the thesis, the cutoff rate is used for bounding the mutual information by the following proposition. 
Proposition 1.3. IH 

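$$I(X;Y) \ge R_0(X;Y)$$
$$I(X;Y \mid Z) \ge R_0(X;Y \mid Z)$$

Proof. The second inequality is an immediate consequence of the first inequality.

$$\begin{aligned}
I(X;Y) &= \sum_{x \in A,\, y \in B} P(X=x, Y=y) \log \frac{P(X=x, Y=y)}{P(X=x)\,P(Y=y)} \\
&= -2 \sum_{x \in A,\, y \in B} P(X=x, Y=y) \log \sqrt{\frac{P(X=x)\,P(Y=y)}{P(X=x, Y=y)}} \\
&\ge -2 \sum_{y \in B} P(Y=y) \log \sum_{x \in A} P(X=x \mid Y=y) \sqrt{\frac{P(X=x)\,P(Y=y)}{P(X=x, Y=y)}} \\
&\ge -\log \sum_{y \in B} P(Y=y) \left[ \sum_{x \in A} P(X=x \mid Y=y) \sqrt{\frac{P(X=x)\,P(Y=y)}{P(X=x, Y=y)}} \right]^2 \\
&= -\log \sum_{y \in B} \left[ \sum_{x \in A} P(X=x) \sqrt{P(Y=y \mid X=x)} \right]^2 \\
&= R_0(X;Y).
\end{aligned}$$

The above inequalities are obtained from Jensen's inequality. □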
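The bound of Proposition 1.3 is easy to verify numerically. The following sketch evaluates both sides from their definitions for a binary symmetric channel with uniform input; the channel family and the function names are assumptions made for illustration.

```python
from math import log2, sqrt


def mutual_information(px, W):
    """I(X;Y) in bits for an input pmf px[x] and a channel W[x][y]."""
    ys = {y for x in W for y in W[x]}
    py = {y: sum(px[x] * W[x].get(y, 0.0) for x in px) for y in ys}
    total = 0.0
    for x in px:
        for y in ys:
            p = px[x] * W[x].get(y, 0.0)
            if p > 0.0:
                total += p * log2(W[x][y] / py[y])
    return total


def cutoff_rate(px, W):
    """R_0(X;Y) = -log2 sum_y [ sum_x P(x) sqrt(W(y|x)) ]^2."""
    ys = {y for x in W for y in W[x]}
    return -log2(sum(sum(px[x] * sqrt(W[x].get(y, 0.0)) for x in px) ** 2
                     for y in ys))


def bsc(p):
    """Binary symmetric channel with crossover probability p (assumed example)."""
    return {0: {0: 1 - p, 1: p}, 1: {0: p, 1: 1 - p}}


uniform = {0: 0.5, 1: 0.5}
pairs = [(mutual_information(uniform, bsc(p)), cutoff_rate(uniform, bsc(p)))
         for p in (0.01, 0.11, 0.3, 0.5)]
```

For this channel the cutoff rate has the closed form $1 - \log_2\bigl(1 + 2\sqrt{p(1-p)}\bigr)$, which gives a second check on the code.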






CHAPTER 2 



Channel Polarization of B-DMCs by Linear Kernel 



2.1. Introduction 



Arikan introduced polar codes whose generator matrix is constructed by choosing rows from $G^{\otimes n}$, where

$$G = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}.$$

Korada, Şaşoğlu, and Urbanke generalized the result to an arbitrary full-rank matrix [9]. Arikan explained
that polar codes are constructed on the basis of the channel polarization phenomenon. This explanation is useful for
understanding polar codes. In this chapter, we consider the channel polarization phenomenon of B-DMCs induced
by an arbitrary linear mapping.



2.2. Preliminaries 

Let $\mathcal{X}$ and $\mathcal{Y}$ be the input alphabet and the output alphabet. In the thesis, we assume that $\mathcal{X}$ is a finite
set and $\mathcal{Y}$ is at most a countable set. A DMC is defined as a conditional probability distribution $W(y \mid x)$ over
$\mathcal{Y}$ for all $x \in \mathcal{X}$. We write $W : \mathcal{X} \to \mathcal{Y}$ to mean a DMC with input alphabet $\mathcal{X}$ and output
alphabet $\mathcal{Y}$. In this chapter, we deal with B-DMCs, i.e., $\mathcal{X} = \{0, 1\}$, and assume that the base of the logarithm is
2.

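Definition 2.1. The symmetric capacity of a B-DMC $W : \mathcal{X} \to \mathcal{Y}$ is defined as

$$I(W) := \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} \frac{1}{2} W(y \mid x) \log \frac{W(y \mid x)}{\frac{1}{2} W(y \mid 0) + \frac{1}{2} W(y \mid 1)}.$$

Note that $I(W) \in [0,1]$.

Definition 2.2. The Bhattacharyya parameter of a B-DMC $W$ is defined as

$$Z(W) := \sum_{y \in \mathcal{Y}} \sqrt{W(y \mid 0)\, W(y \mid 1)}.$$

Note that $Z(W) \in [0,1]$.

Lemma 2.3. [2] The symmetric capacity and the Bhattacharyya parameter satisfy the following relations.

$$I(W) + Z(W) \ge 1$$

$$I(W)^2 + Z(W)^2 \le 1$$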
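Both quantities are straightforward to compute for channels with small output alphabets. The sketch below evaluates them for a binary symmetric channel and a binary erasure channel and checks the two relations of Lemma 2.3; the dictionary representation `W[x][y]` and the function names are assumptions made for illustration.

```python
from math import log2, sqrt


def symmetric_capacity(W):
    """I(W) in bits for a B-DMC given as W[x][y], x in {0, 1}."""
    ys = set(W[0]) | set(W[1])
    total = 0.0
    for x in (0, 1):
        for y in ys:
            p = W[x].get(y, 0.0)
            if p > 0.0:
                q = 0.5 * W[0].get(y, 0.0) + 0.5 * W[1].get(y, 0.0)
                total += 0.5 * p * log2(p / q)
    return total


def bhattacharyya(W):
    """Z(W) = sum_y sqrt(W(y|0) W(y|1))."""
    ys = set(W[0]) | set(W[1])
    return sum(sqrt(W[0].get(y, 0.0) * W[1].get(y, 0.0)) for y in ys)


def bsc(p):
    return {0: {0: 1 - p, 1: p}, 1: {0: p, 1: 1 - p}}


def bec(eps):
    # "e" denotes the erasure symbol.
    return {0: {0: 1 - eps, "e": eps}, 1: {1: 1 - eps, "e": eps}}


channels = [bsc(0.05), bsc(0.11), bsc(0.4), bec(0.3), bec(0.9)]
stats = [(symmetric_capacity(W), bhattacharyya(W)) for W in channels]
```

For the erasure channel the first relation holds with equality, since $I(W) = 1 - \epsilon$ and $Z(W) = \epsilon$.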

2.3. Channel Polarization 

We consider a recursive channel transform using a full-rank square matrix $G$ over $\mathbb{F}_2$. In [2], Arikan chose

$$G = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}. \tag{2.1}$$

In this chapter, following Korada, Şaşoğlu, and Urbanke, we assume that $G$ is an arbitrary full-rank square
matrix. Let $\ell$ be the size of $G$. The channel transform procedure is defined as follows.
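Definition 2.4.

$$W^{\ell}(y_0^{\ell-1} \mid x_0^{\ell-1}) := \prod_{i=0}^{\ell-1} W(y_i \mid x_i)$$

$$W^{(i)}(y_0^{\ell-1}, u_0^{i-1} \mid u_i) := \frac{1}{2^{\ell-1}} \sum_{u_{i+1}^{\ell-1}} W^{\ell}(y_0^{\ell-1} \mid u_0^{\ell-1} G)$$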

In the above definition, W^''' is called a subchannel of W. Let t/g^^ ' ^ denote random variables 

taking values on X'^\ and y^, respectively, and obeying distribution 

where $V$ is an $\ell \times \ell$ full-rank upper triangular matrix. Since there exists a one-to-one correspondence between
$U_0^{i}$ and $X_0^{i}$ for all $i \in \{0, \dots, \ell-1\}$, the statistical properties of $W^{(i)}$ are invariant under the operation $G \to VG$.
Further, a permutation of the columns of $G$ does not change the statistical properties of $W^{(i)}$. Since any full-rank
matrix can be decomposed as $VLP$, where $V$, $L$, and $P$ are upper triangular, lower triangular, and permutation
matrices, respectively, without loss of generality we assume that $G$ is a lower triangular matrix.

Assume that $\{B_n\}_{n \in \mathbb{N}}$ is a sequence of independent uniform random variables taking values on $\{0, \dots, \ell-1\}$.
Let $I_n := I(W^{(B_1) \cdots (B_n)})$. The channel polarization phenomenon is described in the following theorem.

Theorem 2.5. [2], [9] If $G$ is not diagonal, $I_n \to I_\infty$ almost surely, where $I_\infty$ satisfies

$$I_\infty = \begin{cases} 0, & \text{with probability } 1 - I(W) \\ 1, & \text{with probability } I(W). \end{cases}$$

Theorem 2.5 says that the $\ell^n$ subchannels $\{W^{(b_1) \cdots (b_n)}\}_{(b_1, \dots, b_n) \in \{0, \dots, \ell-1\}^n}$ are polarized between noiseless
channels and pure noisy channels for sufficiently large $n$. The first part of Theorem 2.5 is proven by the martingale
convergence theorem without using the assumption that $G$ is not diagonal.
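Polarization is easiest to observe on the binary erasure channel (BEC). The sketch below relies on a well-known property that is not stated in this chapter and is therefore an assumption of the example: for Arikan's 2x2 matrix, the two subchannels of a BEC with erasure probability $e$ are again BECs, with erasure probabilities $2e - e^2$ and $e^2$. All names are illustrative.

```python
def bec_erasure_probs(eps, n):
    """Erasure probabilities of the 2^n subchannels of a BEC(eps).

    Assumes the BEC recursion e -> (2e - e^2, e^2) for Arikan's 2x2 kernel.
    Index i has binary expansion (b_1 ... b_n) with b_1 the most significant bit.
    """
    probs = [eps]
    for _ in range(n):
        nxt = []
        for e in probs:
            nxt.append(2 * e - e * e)  # subchannel (0): less reliable
            nxt.append(e * e)          # subchannel (1): more reliable
        probs = nxt
    return probs


def unpolarized_fraction(probs, delta):
    """Fraction of subchannels whose erasure probability lies in (delta, 1 - delta)."""
    return sum(delta < e < 1 - delta for e in probs) / len(probs)


def mean_capacity(probs):
    """Average of I = 1 - e over the subchannels (the martingale property)."""
    return sum(1 - e for e in probs) / len(probs)


coarse = bec_erasure_probs(0.5, 4)
fine = bec_erasure_probs(0.5, 12)
```

The average capacity stays at $I(W)$ for every $n$, while the fraction of subchannels that are neither nearly noiseless nor nearly useless shrinks as $n$ grows, in line with Theorem 2.5.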

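Lemma 2.6. $\lim_{n \to \infty} I_n$ exists almost surely.

Proof. Let $U_0^{\ell-1}$ and $Y_0^{\ell-1}$ denote random variables taking values on $\mathcal{X}^{\ell}$ and $\mathcal{Y}^{\ell}$, respectively, and
obeying the distribution

$$P(U_0^{\ell-1} = u_0^{\ell-1},\, Y_0^{\ell-1} = y_0^{\ell-1}) = \frac{1}{2^{\ell}} W^{\ell}(y_0^{\ell-1} \mid u_0^{\ell-1} G).$$

From the chain rule for mutual information, shown in Proposition 1.2, one obtains

$$\sum_{i=0}^{\ell-1} I(W^{(i)}) = \sum_{i=0}^{\ell-1} I(U_i; Y_0^{\ell-1}, U_0^{i-1}) = \sum_{i=0}^{\ell-1} I(U_i; Y_0^{\ell-1} \mid U_0^{i-1}) = I(U_0^{\ell-1}; Y_0^{\ell-1}) = \ell\, I(W).$$

Hence, $I_n$ is a bounded martingale. From the martingale convergence theorem, $\lim_{n \to \infty} I_n$ exists almost
surely [5]. □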



PROOF OF Theorem 12. 5 1 Let k denote the largest number where Hamming weight of ^-th row of G is 
larger than 1. Hence, 

I «,) = ^ n ^(3'.' I ^i) n ^(3'; I ^k + x^) n l O) + I 1)) 

jGSo jGSi j=k+\ 

where Sq:= {i e {0,. . . ,k}\ Gki = 0}, 5i := {/ G {0, . . . , fc} | Gfa- = 1 }, and Xj is y-th element of (m^^S 0^"^ )G. 
Let 

W^''Hy,,yk I uk) W{y, \ Uk)W{yk \ Uk) 

where i £Sq. 

From Lemma IZ61 

hm |/(W„+i ) - liWn) I = 0, with probability 1 . 



Hence, 
(2.2) 



lim /(wi*'') -/(W„) = 0, with probability 1. 



Let (D. — Xxy^, 2 , P) denote a probability space where 

P{{i',yuy2))-=\w„{yi\u)Wn{y2\u) 

for {u,yi,y2) G and {U, Fi , denote random variables obeying the distribution P. From (12.21 1. /(Fj , F2; f^) - 
/(Fi ; t/) = I{Y2',U I Fi ) — for all x E X. Since mutual information is lower bounded by cutoff rate as shown 
in Proposition ! 1.31 one obtains 

I{Y2-U\Yi)>-log P{Yi^yi)lY,P{U = u\Yi=yi)^P{Y2=y2\U^u,Yi=yi)\ 

--log £ PiYi^yi)[l-2PiU^0\Yi^yi)PiU^l\Yi^yi)il-Z{W„))] 



1-2 ^ P(Fi -yi) (VPiU = 0\ Fi - 1 I Fi -yi))^: -Z(W„)) 



1-2 P(Fi - yi ) v/^(f/ = I Fi = yi )F(f/ = 1 I Fi = yi ) (1-Z(W„)) 

V.VlGi^,. / 



1 



= - log 
>-log 

= - log 

= - log 
>-log 



The last inequality is obtained from Lemma 2.3. Since the left-hand side of the above inequality tends to
0 with probability 1, we conclude $I_\infty \in \{0,1\}$ with probability 1. Since $I_n$ is a martingale, $I_\infty = 1$ with
probability $I(W)$. □



l-2(-Z(W„)) (1-Z(H'„)) 



l--Z(W„)2(l_z(W„)) 
l-i(l-/(W„))2(^l-^l-/(W„): 
CHAPTER 3



Speed of Polarization 

3.1. Introduction 

In this chapter, we consider how fast the channels $W_n$ are polarized between the noiseless channel and the pure noisy channel.
Instead of $I(W_n)$, we evaluate the Bhattacharyya parameter $Z(W_n)$, which is related to $I(W_n)$ as shown in
Lemma 2.3. Let $Z_n := Z(W^{(B_1) \cdots (B_n)})$. From Theorem 2.5 and Lemma 2.3, $Z_n \to Z_\infty$ almost surely, where $Z_\infty$
satisfies $Z_\infty = 0$ with probability $I(W)$, and $Z_\infty = 1$ with probability $1 - I(W)$. Hence, for any $\epsilon \in (0, 1)$,

$$\lim_{n \to \infty} P(Z_n < \epsilon) = I(W).$$

Arikan and Telatar showed a stronger result when $G$ is the 2x2 matrix (2.1), as follows [2], [3].
Proposition 3.1. For any $\beta < 1/2$,

$$\lim_{n \to \infty} P(Z_n < 2^{-2^{\beta n}}) = I(W).$$

For any $\beta > 1/2$,

$$\lim_{n \to \infty} P(Z_n < 2^{-2^{\beta n}}) = 0.$$

Korada, Şaşoğlu and Urbanke generalized the above result to general matrices [9]. Further, Tanaka and Mori
showed a more detailed speed of polarization [18].

3.2. Preliminaries 
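Definition 3.2. The partial distance $D^{(i)}$ of $G$ is defined as

$$D^{(i)} := \min_{v_{i+1}^{\ell-1},\, w_{i+1}^{\ell-1}} d\bigl((0_0^{i-1}, 0, v_{i+1}^{\ell-1})G,\ (0_0^{i-1}, 1, w_{i+1}^{\ell-1})G\bigr)$$

where $d(a,b)$ denotes the Hamming distance between $a \in \mathcal{X}^{\ell}$ and $b \in \mathcal{X}^{\ell}$, and where $0_0^{i-1}$ denotes the all-zero
vector of length $i$.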

Partial distance plays a central role in evaluation of speed of the polarization phenomenon. 
Lemma 3.3. ID 

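$$Z(W)^{D^{(i)}} \le Z(W^{(i)}) \le 2^{\ell-i} Z(W)^{D^{(i)}}.$$

Definition 3.4. The exponent of a matrix $G$ is defined as $E(G) := (1/\ell) \sum_{i=0}^{\ell-1} \log_{\ell} D^{(i)}$. The second exponent
of a matrix $G$ is defined as $V(G) := (1/\ell) \sum_{i=0}^{\ell-1} \bigl(\log_{\ell} D^{(i)} - E(G)\bigr)^2$.

Definition 3.5. The $Q$ function is defined as

$$Q(t) := \frac{1}{\sqrt{2\pi}} \int_t^{\infty} e^{-x^2/2}\, dx.$$

In this chapter, the base of the logarithm is assumed to be 2 unless otherwise stated.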
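The quantities defined in this section can be computed by brute force for small matrices. The sketch below does so for Arikan's 2x2 matrix, under the assumption that the $Q$ function is the standard Gaussian tail probability, which can be written with the complementary error function; the function names are hypothetical.

```python
from itertools import product
from math import erfc, log, sqrt


def partial_distances(G):
    """Partial distances D^(0), ..., D^(l-1) of a binary l x l matrix G."""
    l = len(G)

    def encode(u):
        return [sum(u[r] * G[r][c] for r in range(l)) % 2 for c in range(l)]

    dists = []
    for i in range(l):
        best = l  # the Hamming distance between length-l vectors is at most l
        for v in product((0, 1), repeat=l - i - 1):
            for w in product((0, 1), repeat=l - i - 1):
                a = encode([0] * i + [0] + list(v))
                b = encode([0] * i + [1] + list(w))
                best = min(best, sum(x != y for x, y in zip(a, b)))
        dists.append(best)
    return dists


def exponent(G):
    """E(G) = (1/l) sum_i log_l D^(i)."""
    l = len(G)
    return sum(log(d, l) for d in partial_distances(G)) / l


def second_exponent(G):
    """V(G) = (1/l) sum_i (log_l D^(i) - E(G))^2."""
    l = len(G)
    e = exponent(G)
    return sum((log(d, l) - e) ** 2 for d in partial_distances(G)) / l


def q_function(t):
    """Gaussian tail probability Q(t) = 0.5 * erfc(t / sqrt(2))."""
    return 0.5 * erfc(t / sqrt(2))


G = [[1, 0], [1, 1]]
```

For this matrix the partial distances are 1 and 2, so $E(G) = 1/2$, which matches the threshold $\beta = 1/2$ in Proposition 3.1, and $V(G) = 1/4$.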
3.3. Speed of Polarization

3.3.1. Speed of polarization and random process. The following result was obtained by Arikan and
Telatar [3] when $G$ is the 2x2 matrix (2.1), and by Korada, Şaşoğlu, and Urbanke for the general case [9].

Theorem 3.6.

$$\lim_{n \to \infty} P\bigl(Z_n < 2^{-\ell^{\beta n}}\bigr) = I(W)$$

for any $\beta < E(G)$.

$$\lim_{n \to \infty} P\bigl(Z_n < 2^{-\ell^{\beta n}}\bigr) = 0$$

for any $\beta > E(G)$.

Tanaka and Mori showed a more detailed speed of polarization [18].
Theorem 3.7. For any $f(n) = o(\sqrt{n})$,

$$\lim_{n \to \infty} P\Bigl(Z_n < 2^{-\ell^{\,nE(G) + t\sqrt{V(G)n} + f(n)}}\Bigr) = I(W)\,Q(t).$$

In order to prove Theorem 3.6 and Theorem 3.7, we consider a generalized process. Let $\{S_n\}_{n \in \mathbb{N}}$ be
independent and identically distributed random variables ranging on $[1, \infty)$. Assume that the expectation and
the variance of $\log S_1$ exist, and are denoted by $\mathbb{E}[\log S_1]$ and $\mathbb{V}[\log S_1]$, respectively. The random process
$\{X_n \in (0, 1)\}_{n \in \mathbb{N}}$ satisfies the following conditions.

(c1) $X_n \to X_\infty$ almost surely.

(c2) There exists a positive constant $c_0$ such that $c_0 X_n^{S_n} \le X_{n+1}$.
(c3) There exists a positive constant $c_1$ such that $X_{n+1} \le c_1 X_n^{S_n}$.
(c4) $S_n$ is independent of $X_m$ for $m \le n$.

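In the following proofs, the above conditions are used. The random process $\{Z_n\}_{n \in \mathbb{N}}$ satisfies (c2) and (c3)
when $S_n = D^{(B_n)}$. Then, it holds that $\mathbb{E}[\log S_1] = E(G) \log \ell$ and that $\mathbb{V}[\log S_1] = V(G) (\log \ell)^2$. Let
$\mathcal{T}_m^n(\gamma) := \{\omega \in \Omega \mid X_k(\omega) < \gamma,\ \forall k \in \{m, m+1, \dots, n\}\}$ and $\mathcal{T}_m^{\infty}(\gamma) := \bigcap_{n=m}^{\infty} \mathcal{T}_m^n(\gamma)$. From (c1), there exist zero sets $A$
and $B$, where $P(A) = P(B) = 0$, such that

$$\{\omega \in \Omega \mid X_\infty(\omega) < \gamma\} \subseteq \bigcup_{m} \mathcal{T}_m^{\infty}(\gamma) \cup A \subseteq \{\omega \in \Omega \mid X_\infty(\omega) \le \gamma\} \cup B$$

for any $\gamma \in [0, 1]$.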

3.3.2. Direct part of Theorem 3.6.

Proposition 3.8. Let $\{X_n\}_{n \in \mathbb{N}}$ be a random process satisfying (c1) and (c3). For any fixed $\beta \in (0, \mathbb{E}[\log S_1])$,

$$\lim_{n \to \infty} P\bigl(X_n < 2^{-2^{\beta n}}\bigr) = P(X_\infty = 0).$$

Proof. Fix e e (0, 1). We consider a process {L,} defined on the basis of {X,} as 

L,- = loglog(l/X,), / = 0,...,m 

L,+i = log(5/ - e) +L;, / > m. 

Fix ^ > max{l,ci}. Conditional on 7^^!,"^^^' (C"''''^)^ the inequality loglog(l/X„) > L„ holds for any n e 
{m, m+l,...,m + k}. On the other hand, it holds 

m+k— 1 m+k— 1 

Lm+k^L,„+ ^ log(5,-e) >L„+ ^ (log5,- + log(l - e)). 

i=m i=m 

Conditional on C,^+'^-i {(1 //t) 1™+*^" ' log 5; > E[log5i] - e}, it holds 

Lm+k > ^(]E[log5i] - e + log(l - e)) +L„^. 



Hence, 

P(loglog(l/X,„+,) > ^(E[log5i] - e + log(l - e)) +L,„) > P [U'^'-' {l^-"')nC':+'-' 

> 1-P 



From the law of large numbers, it holds lim^-^oo P (C^;^^ '"^j — 0. Since X„ converges to X„ almost surely, 
lim„,^ooP(7;r(C"''''')) > PO^o. < C"''''')- On the other hand, we observe 



UminfP(loglog(l/X^+i) > ^(E[log5i] - e + log(l - e)) +L„. 



< liminfP -loglog(l/X„) > E[log5i] - 7 



for any 7 > e — log(l — e). Hence, 



liminfP ( iloglog(l/X„) > E[log5i] - 7 ) > P{X^ < C"'/'). 

□ 

3.3.3. Converse part of Theorem 3.6.
Proposition 3.9. Let $\{X_n\}_{n \in \mathbb{N}}$ be a random process satisfying (c1) and (c2). For any fixed $\beta > \mathbb{E}[\log S_1]$,

$$\lim_{n \to \infty} P\bigl(X_n < 2^{-2^{\beta n}}\bigr) = 0.$$



Proof. Fix eg (0, 1). We consider a process {L,} defined on the basis of {X,} as 

L,-loglog(l/X,), / = 0,...,m 
L,+i = log(5/ + e)+Li, i> m. 

Fix e (0,min{co, 1}). Conditional on T^{^^^^), it holds loglog(l /X„) < L„ for any « > m. It holds 

m+k— 1 m+k— 1 

L„,+i=L„,+ log(5; + e) < L,„ + ^ (log5; + e). 

i=m i=m 

For any 7 > 0, 

hmsupP f -loglog(l/X„) > E[log5i] +2e ) 

= limsupP ( ^-loglog(l/X,„+^) > E[log5i] +2e ) 

k-^^ \m + k J 

: limsup |p > Epog^i] +2e p TTIC'/')) < 7 fl TTIC'/') 

; limsupjp i^-^^n+k > E[log5i] +2e^ +p(x,„+i, < 7 p X^iC'^'T') | 
<limsup|p|^--^ |^L„,+^e+ ^ log5, j > E[log5i] +2e j | 



< 



< 



+p(^oo<7n'^r(c'/^ 
p(x~<7n'^r(c'/' 



The last equality is obtained from the law of large numbers. 

lim P (x„ < 7 n r-iC'^'r) = 1 - lim P (x. > 7 U T^iC'^ 



By letting 7 = 11,^1^/2, the right-hand side of the above inequality is equal to zero. □ 

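3.3.4. Direct part of Theorem 3.7.
Proposition 3.10. Let $\{X_n\}_{n \in \mathbb{N}}$ be a random process satisfying (c1), (c3) and (c4). For any $f(n) = o(\sqrt{n})$,

$$\liminf_{n \to \infty} P\Bigl(X_n < 2^{-2^{\mathbb{E}[\log S_1] n + t\sqrt{\mathbb{V}[\log S_1] n} + f(n)}}\Bigr) \ge P(X_\infty = 0)\,Q(t).$$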
Proof. Let L„ logX„. Let 7:= max{2,ci}. One obtains 

(n-l n-l \ /n-1 \ /n-1 \ 

L n < ((n-m)log7+L,„). 

j=m/=7-l-l / \i=m j \i=m j 

Fixj3 e (0,£(G)). Letm:= (logn + loglog7)/j3. Conditioned on X',„(j3) := {co e | X,„(c(j) < 2-^'''"}, 

i«<- ^n 5,^mlog7. 

Let (f ) {!"=:,', log > (« - m)E[log5i] + fVV[log5i](« -m) + fin - m)} where /(^) - 
Conditioned on Pmi^jS) and "H","' (0' it holds 



log(-L„) > logm + loglog7+ («-m)E[log5i] +fv/V[log5i](« -m) + /(n-m). 
Hence, it holds 



P(log(-L„) >logm + loglog7+(«-m )E[log5i]+rv/V[log5i]( n-m)+ f(n-m) 

> p {v,„{i5 ) n m-Ht)) = i^n (j3 )) p (^r ' (f )) • 

The last equality follows from (c4). From Theorem l3.6l it holds lim„,^„oP (!?„,( j3)) = P{Xc<, = 0). From the 
central limit theorem, it holds lim„^ooP (^," ' (0) =6(0- 1^^^' one obtains 

liminfpfloglog(l/X„) >«E[log5i]+fv/V[log5i]« + /(«)) > P(Xoe = 0)2(0- 

for any /(n) = o{y/n). □ 

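3.3.5. Converse part of Theorem 3.7.
Proposition 3.11. Let $\{X_n\}_{n \in \mathbb{N}}$ be a random process satisfying (c1), (c2) and (c4). For any $f(n) = o(\sqrt{n})$,

$$\limsup_{n \to \infty} P\Bigl(X_n < 2^{-2^{\mathbb{E}[\log S_1] n + t\sqrt{\mathbb{V}[\log S_1] n} + f(n)}}\Bigr) \le P(X_\infty = 0)\,Q(t).$$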
Proof. Let L„ := logX„. Let 7:= min{l,co}. For any m < n, one obtains 

(n-l ,1 \ /n-1 \ /n-1 \ 

L n log ^+ n > n ( (« - log 7+ ^-m) ■ 
j=ni i=j-\-l J \i=m J \i=m J 
For any $\delta \in (0, 1]$, one obtains

lim sup P (log log( 1 /X„)>E [log + / ^/¥\^ogS^^ +f{n)) 

< UmsupP (loglog(l/X„) > E[logSl]n + t^yY[logSl]n + f{n), X„ < s) 
+ limsupPfloglog(l/X„) > E[logSl]n + t^/Y[logSl]n + f{n), X„ > s) 

< Umsup/'(loglog(l/X„) > E[\ogSi\n + tyjN[\ogSi]n + f{n), X„ < s) 

+ limsupF (xn < |, X„ > 5 

(n-\ > 

<UmsupP £log5i + log(-(n-m)logr-L^) >E[log5i]n + ?VWog^ + /W,^m< 5 

n^-oo \i=m J 

+ p{x^<\. X„>5 

= Q{t)P{X,n < 8)+P (x„ < |, X„ > 5^ . 
The last equality follows from (c4) and the central Umit theorem. One obtains 

Umsupp(loglog(l/X„)>E[log5i]n + fv'V[log5i]n + /(n)) 

< limsup |e(OP(X;„ <5)+p(x^<^,X„,>S 

< Q(t)P(X^ < 5)+P (x^ < |, > 5^ = Q{t)P{Xoo < 5). 

By letting 5 to 0, one obtains the result. □ 
CHAPTER 4



Polar Codes and its Construction 



4.1. Introduction 

Polar codes are channel codes based on the channel polarization phenomenon. Polar codes achieve
symmetric capacity under efficient encoding and decoding algorithms. However, the construction of polar codes
requires high computational cost in the original work [2]. One of the contributions of the thesis is to show, for
symmetric B-DMCs, a construction method with complexity $O(N)$, where $N$ is the blocklength [11].

4.2. Preliminaries 

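For $x \in \{0, 1\}$, $\bar{x}$ represents the bit flipping of $x$.

Definition 4.1 (Symmetric B-DMC). A B-DMC $W : \mathcal{X} \to \mathcal{Y}$ is said to be symmetric if there exists a
permutation $\pi$ on $\mathcal{Y}$ such that $W(\pi(y) \mid x) = W(y \mid \bar{x})$ for all $y \in \mathcal{Y}$.

Definition 4.2. The error probability of a B-DMC $W$ is defined as

$$P_e(W) := \frac{1}{2} \sum_{y : W(y \mid 1) > W(y \mid 0)} W(y \mid 0) + \frac{1}{2} \sum_{y : W(y \mid 1) < W(y \mid 0)} W(y \mid 1) + \frac{1}{2} \sum_{y : W(y \mid 1) = W(y \mid 0)} W(y \mid 0).$$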

In order to bound the error probability of polar codes, Bhattacharyya parameter is useful. 
Lemma 4.3. HI 

$$\frac{1}{2}\left(1 - \sqrt{1 - Z(W)^2}\right) \le P_e(W) \le \frac{1}{2} Z(W).$$
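The two bounds of Lemma 4.3 can be checked on simple channels. The sketch below evaluates the error probability of Definition 4.2 and the Bhattacharyya parameter for binary symmetric and binary erasure channels; the channel representation `W[x][y]` and the function names are assumptions made for illustration.

```python
from math import sqrt


def error_probability(W):
    """P_e(W) of Definition 4.2 for a B-DMC given as W[x][y]."""
    total = 0.0
    for y in set(W[0]) | set(W[1]):
        w0, w1 = W[0].get(y, 0.0), W[1].get(y, 0.0)
        if w1 > w0:
            total += 0.5 * w0   # the decision is 1, wrong when 0 was sent
        elif w1 < w0:
            total += 0.5 * w1   # the decision is 0, wrong when 1 was sent
        else:
            total += 0.5 * w0   # tie: a fair coin flip
    return total


def bhattacharyya(W):
    """Z(W) = sum_y sqrt(W(y|0) W(y|1))."""
    return sum(sqrt(W[0].get(y, 0.0) * W[1].get(y, 0.0))
               for y in set(W[0]) | set(W[1]))


def lemma_4_3_bounds(W):
    """Return (lower bound, P_e, upper bound) for the channel W."""
    z = bhattacharyya(W)
    return 0.5 * (1 - sqrt(1 - z * z)), error_probability(W), 0.5 * z


def bsc(p):
    return {0: {0: 1 - p, 1: p}, 1: {0: p, 1: 1 - p}}


def bec(eps):
    return {0: {0: 1 - eps, "e": eps}, 1: {1: 1 - eps, "e": eps}}


results = [lemma_4_3_bounds(W) for W in (bsc(0.11), bsc(0.3), bec(0.2), bec(0.7))]
```

In these examples the lower bound is met with equality by the BSC and the upper bound by the BEC, which suggests that neither bound can be improved in general.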

4.3. Polar Codes 

Polar codes are based on the channel polarization phenomenon. Fix an $\ell \times \ell$ matrix $G$, $\mathcal{F} \subseteq \{0, \dots, \ell^n - 1\}$
and $u_{\mathcal{F}}$. Variables belonging to $u_{\mathcal{F}}$ and $u_{\mathcal{F}^c}$ are called frozen variables and information variables,
respectively. Let $G_n := (I_{\ell^{n-1}} \otimes G) R_{\ell,n} (I_{\ell} \otimes G_{n-1})$, where $\otimes$ denotes the Kronecker product, where $R_{\ell,n}$ is a
permutation matrix such that $(u_0, \dots, u_{\ell^n-1}) R_{\ell,n} = (u_0, u_{\ell}, \dots, u_{\ell^n-\ell}, u_1, u_{\ell+1}, \dots, u_{\ell^n-\ell+1}, \dots, u_{\ell-1}, u_{2\ell-1}, \dots, u_{\ell^n-1})$,
where $I_k$ denotes the identity matrix of size $k$, and where $G_1 = G$. An encoding result of a polar code of length
$\ell^n$ is represented as $u_0^{\ell^n-1} G_n$, where $u_{\mathcal{F}^c}$ is constituted by pre-encoding values corresponding to a message.
Note that $G_n = B_{\ell,n} G^{\otimes n}$, where $B_{\ell,n}$ is the bit-reversal permutation matrix with respect to the $\ell$-ary expansion [2].
More precisely, for $x_0^{\ell^n-1} = u_0^{\ell^n-1} B_{\ell,n}$, $x_i$ is equal to $u_j$, where the $\ell$-ary expansion $b_1 \cdots b_n$ of $i$ is the reverse of
the $\ell$-ary expansion $b_n \cdots b_1$ of $j$.

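We assume a successive cancellation (SC) decoder for polar codes. For $i \in \{0, \dots, \ell^n - 1\}$, let

$$W_n^{(i)}(y_0^{\ell^n-1}, u_0^{i-1} \mid u_i) := \frac{1}{2^{\ell^n-1}} \sum_{u_{i+1}^{\ell^n-1}} W^{\ell^n}(y_0^{\ell^n-1} \mid u_0^{\ell^n-1} G_n).$$

In SC decoding, all variables, which consist of information variables and frozen variables, are decoded
sequentially from $u_0$ to $u_{\ell^n-1}$. The decoding result for $u_i$ of the SC decoder is

$$\hat{U}_i(y_0^{\ell^n-1}, \hat{u}_0^{i-1}) = \begin{cases} u_i, & \text{if } i \in \mathcal{F} \\ \arg\max_{u_i \in \{0,1\}} W_n^{(i)}(y_0^{\ell^n-1}, \hat{u}_0^{i-1} \mid u_i), & \text{if } i \in \mathcal{F}^c \end{cases}$$

where $\hat{u}_0^{i-1}$ is the result of SC decoding for $u_0^{i-1}$. When $W_n^{(i)}(y_0^{\ell^n-1}, \hat{u}_0^{i-1} \mid 0) = W_n^{(i)}(y_0^{\ell^n-1}, \hat{u}_0^{i-1} \mid 1)$ for $i \in \mathcal{F}^c$,
the decoding result is determined as 0 or 1, each with probability one half.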
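The sequential structure of SC decoding can be sketched recursively for Arikan's 2x2 matrix. The code below is an illustration under several assumptions that do not come from the thesis: it works with log likelihood ratios, uses the natural index order (no bit-reversal permutation), replaces the exact check-node rule by the min-sum approximation, and breaks ties deterministically in favor of 0 instead of flipping a coin. All names are hypothetical.

```python
def polar_transform(u):
    """x = u G^{(x) n} over GF(2) for the 2x2 kernel, natural order."""
    x = list(u)
    h = 1
    while h < len(x):
        for start in range(0, len(x), 2 * h):
            for j in range(start, start + h):
                x[j] ^= x[j + h]
        h *= 2
    return x


def f_check(a, b):
    """Min-sum approximation of the LLR of the XOR of two bits."""
    sign = 1.0 if (a >= 0) == (b >= 0) else -1.0
    return sign * min(abs(a), abs(b))


def g_var(a, b, bit):
    """LLR of the second bit once the XOR of the two bits is known to be `bit`."""
    return b + (1 - 2 * bit) * a


def sc_decode(llr, frozen):
    """Return (u_hat, x_hat). frozen[i] is None for information bits,
    otherwise the known value of the frozen bit."""
    n = len(llr)
    if n == 1:
        if frozen[0] is not None:
            u = frozen[0]
        else:
            u = 0 if llr[0] >= 0 else 1  # deterministic tie-break (assumption)
        return [u], [u]
    half = n // 2
    left, right = llr[:half], llr[half:]
    # First half of u: decoded through the check-node combination.
    ua, xa = sc_decode([f_check(a, b) for a, b in zip(left, right)], frozen[:half])
    # Second half of u: decoded using the decisions already made.
    ub, xb = sc_decode([g_var(a, b, s) for a, b, s in zip(left, right, xa)],
                       frozen[half:])
    return ua + ub, [p ^ q for p, q in zip(xa, xb)] + xb


# Hypothetical length-8 code: indices 0, 1, 2, 4 frozen to 0.
frozen = [0, 0, 0, None, 0, None, None, None]
```

As a usage example, encoding an input vector with `polar_transform`, mapping each code bit to an LLR of +4 or -4, and calling `sc_decode` recovers the input when no channel errors occur.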
4.4. Error Probabilities of Polar Codes

We now consider the expected error probability of polar codes where the values of $u_{\mathcal{F}}$ are uniformly chosen
from $\{0, 1\}^{|\mathcal{F}|}$. Let $(\Omega = \{0, 1\}^{\ell^n} \times \mathcal{Y}^{\ell^n}, 2^{\Omega}, P)$ be a probability space where $P$ is

Let Bi and Ai be 

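From the definition, one obviously sees $B_i \subseteq A_i$. The expected error probability of polar codes where the values
of $u_{\mathcal{F}}$ are uniformly chosen from $\{0, 1\}^{|\mathcal{F}|}$ is $P\bigl(\bigcup_{i \in \mathcal{F}^c} B_i\bigr)$. One obtains an upper bound on the expected error
probability as

$$P\Bigl(\bigcup_{i \in \mathcal{F}^c} B_i\Bigr) = \sum_{i \in \mathcal{F}^c} P(B_i) \le \sum_{i \in \mathcal{F}^c} P(A_i) = \sum_{i \in \mathcal{F}^c} P_e(W_n^{(i)}) = \sum_{i \in \mathcal{F}^c} P_e(W^{(b_1) \cdots (b_n)}) \le \sum_{i \in \mathcal{F}^c} Z(W^{(b_1) \cdots (b_n)}) \tag{4.1}$$

where the $\ell$-ary expansion of $i$ is $(b_1 \cdots b_n)$. The last equality is not proven here. If one chooses
$\mathcal{F}^c = \{i \in \{0, \dots, \ell^n - 1\} \mid Z(W_n^{(i)}) < 2^{-\ell^{\beta n}}\}$, the expected error probability is smaller than $\ell^n 2^{-\ell^{\beta n}}$. From Theorem 3.6,
$|\mathcal{F}^c| / \ell^n$ is close to $I(W)$ as $n \to \infty$ for any $\beta \in (0, E(G))$. Hence, the expected error probability is $o(2^{-\ell^{\beta n}})$
for any $\beta \in (0, E(G))$ while the coding rate is fixed and smaller than $I(W)$. On the other hand, one obtains

$$P\Bigl(\bigcup_{i \in \mathcal{F}^c} B_i\Bigr) \ge \max_{i \in \mathcal{F}^c} P(A_i) = \max_{i \in \mathcal{F}^c} P_e(W^{(b_1) \cdots (b_n)}) \ge \max_{i \in \mathcal{F}^c} \frac{1}{2}\left(1 - \sqrt{1 - Z(W^{(b_1) \cdots (b_n)})^2}\right).$$

Hence, the expected error probability is $\omega(2^{-\ell^{\beta n}})$ for any $\beta > E(G)$. From Propositions 3.10 and 3.11, one
obtains the following result [18].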

Theorem 4.4. There exists a sequence of polar codes such that coding rate tends to R < I{W) and the error 
probability is 

o 2 



for any f{n) = o(y^). The error probability of any sequence of polar codes where coding rate tends to 
R < I{W) is 



(0 I 2 

for any e > 0. 

We now consider the asymptotic expected error probability of polar codes in a restricted class under maximum
likelihood (ML) decoding. Assume that the weight of the $i$-th row of $G$ is equal to $D^{(i)}$. Then, the weight
of the $i$-th row of $G^{\otimes n}$ is $\prod_{j=1}^{n} D^{(b_j)}$, where $(b_1 \cdots b_n)$ is the $\ell$-ary expansion of $i$. The fraction of rows which satisfy
$\sum_{j=1}^{n} \log_{\ell} D^{(b_j)} > nE(G) + t\sqrt{V(G)n}$ tends to $Q(t)$ by the central limit theorem. Since the error probability
of ML decoding is lower bounded by Pe{W)^ where D is the minimum distance of the code, expected error 
probability of polar codes on ML decoding is 0(2^^^''''"^^ {RWv(G)n+E^^ e > 0. 

4.5. Complexities 

4.5.1. Complexity of encoding. Since the encoding procedure of polar codes is multiplication by a matrix,
the complexity of encoding is $O(\ell^{2n})$. Further, since the matrix $G^{\otimes n}$ has a recursive structure, the complexity
is reduced as in the fast Fourier transform. Let $c$ denote the complexity of evaluating $u_0^{\ell-1} G$. Let $d$ denote
the complexity of evaluating $u_0^{\ell^n-1} R_{\ell,n}$ divided by $\ell^n$. Let $\chi_E(n)$ denote the complexity of evaluating
$u_0^{\ell^n-1} G_n$. Since $G_n = (I_{\ell^{n-1}} \otimes G) R_{\ell,n} (I_{\ell} \otimes G_{n-1})$, one obtains $\chi_E(n) = \ell^{n-1} c + \ell^n d + \ell \chi_E(n-1)$. Hence,

$$\chi_E(n) = O(n \ell^n).$$
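The reduction from quadratic to quasi-linear complexity can be seen by comparing a dense matrix multiplication with a butterfly-style transform. The sketch below assumes Arikan's 2x2 matrix and the natural index order, so the permutation matrices of the text are left out; it counts XOR operations to make the $n 2^{n-1}$ cost visible. The function names are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two 0/1 matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def dense_encode(u, g):
    """Row vector times matrix over GF(2): about N^2 operations."""
    n = len(u)
    return [sum(u[r] * g[r][c] for r in range(n)) % 2 for c in range(n)]


def fast_encode(u):
    """Butterfly evaluation of u times the n-th Kronecker power of [[1,0],[1,1]].

    Returns the codeword and the number of XOR operations performed.
    """
    x = list(u)
    xor_count = 0
    h = 1
    while h < len(x):
        for start in range(0, len(x), 2 * h):
            for j in range(start, start + h):
                x[j] ^= x[j + h]
                xor_count += 1
        h *= 2
    return x, xor_count


G = [[1, 0], [1, 1]]
G3 = kron(kron(G, G), G)
codeword, xors = fast_encode([1, 0, 1, 1, 0, 0, 1, 0])
```

For blocklength $N = 2^n$ the butterfly performs exactly $(N/2) \log_2 N$ XOR operations, 12 for $N = 8$, whereas the dense product touches all $N^2$ matrix entries.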
Figure 1. Left: Factor graph of $G_3$. Right: Decoding graph of $u_3$.



4.5.2. Complexity of decoding. SC decoding can be described as 

{Ui, if/eJ^ 
0, 

where 



if/^ ^"(y, 



):=log 







)<0 



1 

, Ma 



0) 



1) 



is the log HkeHhood ratio (LLR) of m,. Let Q{Iq ') := ' for ' e where r,- denotes the LLR of m, 

Let Xoin) denote the number of evaluation of Q in calculation 



given Mq ' when an LLR of Mq 'G is 



of {L^^}ie{o,....e'<-i}- Since f/f',„+,' 
Hence, 



■ 



G(t/^^'+^"') Fq^""' for all / e {0, ... ,^ - 1}, it holds that Xoin) 
0{ni"). 



4.5.3. Complexity of construction. Construction of a polar code is equivalent to selection of a set $\mathcal{F}$ of
frozen variables. In [2], Arikan proposed a criterion by which indices $i$ with small $Z(W_n^{(i)})$ are chosen as information
variables in order to minimize the upper bound (4.1). However, unless $W$ is the binary erasure channel
(BEC), the complexity of evaluating $Z(W_n^{(i)})$ is exponential in the blocklength. In order to avoid the
high computational cost, he also proposed a Monte-Carlo method which estimates $Z(W_n^{(i)})$ by numerical
simulations. Arikan also proposed a heuristic method in which a B-DMC $W$ is regarded as the BEC of
erasure probability $1 - I(W)$ [1]. However, polar codes constructed by these methods do not provably achieve
symmetric capacity. In this chapter, we describe a novel construction method for any symmetric B-DMC
whose complexity is linear in the blocklength [12], [13]. Polar codes constructed by the method provably
achieve symmetric capacity. The method is based on density evolution, which has been used for evaluation
of the large-blocklength limit of the bit error probability of LDPC codes.
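The heuristic construction described above can be sketched in a few lines. The code below treats a BSC with crossover probability 0.11 as a BEC of the same capacity and picks the indices with the smallest Bhattacharyya parameters. It assumes the well-known BEC recursion $Z \mapsto (2Z - Z^2, Z^2)$ for Arikan's 2x2 matrix, which is not derived in this section, and it selects indices by sorting rather than by a linear-time selection algorithm. The channel parameters, the rate 0.4, and all names are illustrative.

```python
from math import log2


def binary_entropy(p):
    """Binary entropy function in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def bec_bhattacharyya(eps, n):
    """Z of the 2^n subchannels of a BEC(eps), assuming Z -> (2Z - Z^2, Z^2)."""
    z = [eps]
    for _ in range(n):
        nxt = []
        for v in z:
            nxt.append(2 * v - v * v)
            nxt.append(v * v)
        z = nxt
    return z


def choose_information_set(z, k):
    """Indices of the k smallest Bhattacharyya parameters (sorting, O(N log N))."""
    order = sorted(range(len(z)), key=lambda i: z[i])
    return sorted(order[:k])


capacity = 1 - binary_entropy(0.11)     # symmetric capacity of the BSC(0.11)
surrogate_eps = 1 - capacity            # BEC with the same capacity
z = bec_bhattacharyya(surrogate_eps, 12)
info = choose_information_set(z, k=int(0.4 * len(z)))
union_bound = sum(z[i] for i in info)   # the bound (4.1), valid for the surrogate BEC only
```

The value `union_bound` bounds the error probability over the surrogate BEC, not over the actual BSC, which is one way to see why this heuristic carries no guarantee for the true channel.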

4.6. Factor Graphs, Belief Propagation and Density Evolution 

Factor graphs, belief propagation (BP), and density evolution are important tools in the analysis of codes on graphs such as LDPC codes.
The book of Richardson and Urbanke is a good reference [15]. A factor graph is a graph which represents
a probability distribution. The left panel of Figure 1 shows the factor graph of $B G^{\otimes 3}$ when $G$ is the 2x2
matrix (2.1). Belief propagation is an efficient algorithm for calculating marginal probability distributions
on a tree factor graph. SC decoding can be regarded as BP decoding on a tree graph, as in the right panel of
Figure 1.

Density evolution is a method which recursively evaluates probability distributions of messages on a tree
graph. Let $W$ be a symmetric B-DMC. There exists a probability density function $a_W$ on $(-\infty, +\infty]$ of an
LLR when 0 is transmitted, which is a linear combination of Dirac delta functions. When $W$ is the BEC
of erasure probability $\epsilon$, $a_W = (1 - \epsilon)\delta_{\infty} + \epsilon\delta_0$, where $\delta_x$ is the Dirac delta function centered at $x$. When the
probability density functions of the input messages of a variable node (respectively a check node) are $a$ and $b$, the
probability density function of the output message is denoted by $a \circledast b$ (respectively $a \boxplus b$). Details of density
evolution are given in [15].

4.7. Construction using Density Evolution 

In this section, for simplicity, we assume that $G$ is the 2x2 matrix (2.1). We consider using density
evolution for the evaluation of $P_e(W_n^{(i)})$ for $i \in \{0, \dots, 2^n - 1\}$. In fact, we can evaluate the probability density
function of an LLR of $W_n^{(i)}$ by density evolution [11].

Theorem 4.5. For n> 1, 

^elwi'^ ) is obtained by an appropriate integration of a^(,) . 

Let us consider the number $\chi_c(n)$ of operations $\circledast$ and $\boxast$ in the calculation of $\{a_{2^n}^{(i)}\}_{i=0,\dots,2^n-1}$. In order 
to calculate $\{a_{2^n}^{(i)}\}_{i=0,\dots,2^n-1}$, calculation of $\{a_{2^{n-1}}^{(i)}\}_{i=0,\dots,2^{n-1}-1}$ is required. Further, $2^n$ operations of $\circledast$ and 
$\boxast$ are necessary. Hence, 

\[ \chi_c(n) = 2^n + \chi_c(n-1). \]

This implies $\chi_c(n) = O(2^n)$, meaning that it is proportional to the blocklength. It is known that the complexity 
of selecting the $s$ smallest elements from a set of size $t$ is $O(t)$. Hence, the complexity of construction is 
linear in the blocklength if we assume that the complexity of the operations $\circledast$ and $\boxast$ is constant. However, 
the required precision increases as the blocklength increases. When $W$ is the binary symmetric channel 
(BSC), the number of mass points grows exponentially in the blocklength. It is not well understood how 
quantization errors affect the performance of the resulting codes. 
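
The recursive evaluation described in this section can be illustrated by a small sketch for the BSC. It is not the implementation used in the thesis. It assumes the usual indexing in which each density at level $n-1$ produces one check-node combination (index $2i$) and one variable-node combination (index $2i+1$) at level $n$, which is consistent with the count of $2^n$ operations above. The function names, the dictionary representation of mass points, and the rounding used to merge mass points are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def bsc_llr_density(p):
    """LLR density of BSC(p) when 0 is sent, stored as {llr: mass}."""
    llr = math.log((1 - p) / p)
    return {llr: 1 - p, -llr: p}

def _combine(a, b, op, digits=9):
    """Density of op(X, Y) for independent X ~ a and Y ~ b.
    Mass points that agree after rounding are merged (an assumption of this sketch)."""
    out = defaultdict(float)
    for x, px in a.items():
        for y, py in b.items():
            out[round(op(x, y), digits)] += px * py
    return dict(out)

def variable_node(a, b):
    """Variable-node operation: LLRs add."""
    return _combine(a, b, lambda x, y: x + y)

def _boxplus(x, y):
    t = math.tanh(x / 2) * math.tanh(y / 2)
    t = max(-1 + 1e-15, min(1 - 1e-15, t))  # keep atanh finite
    return 2 * math.atanh(t)

def check_node(a, b):
    """Check-node operation on LLRs."""
    return _combine(a, b, _boxplus)

def error_probability(a):
    """Mass on negative LLRs plus half of the mass at zero."""
    return sum(m for x, m in a.items() if x < 0) + 0.5 * a.get(0.0, 0.0)

def evolve(a_w, n):
    """Return the 2^n densities after n levels (assumed index convention)."""
    dens = [a_w]
    for _ in range(n):
        dens = [f(a, a) for a in dens for f in (check_node, variable_node)]
    return dens

def choose_information_set(a_w, n, k):
    """Indices of the k sub-channels with the smallest error probability."""
    pe = [error_probability(a) for a in evolve(a_w, n)]
    return sorted(sorted(range(len(pe)), key=pe.__getitem__)[:k])
```

The number of mass points grows quickly with $n$ for the BSC, as noted above, so this exact enumeration is only practical for very small $n$; a practical construction would need some form of quantization.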

4.8. Numerical Calculation and Simulation 

In this section, the error probability of polar codes constructed by using density evolution and the error 
probability of polar codes constructed by Arikan's heuristic method, in which $W$ is regarded as a BEC of the 
same capacity, are compared. Figure 2 shows results for the BSC with crossover probability 0.11 and 
blocklength 4096. The capacity of the BSC is approximately 0.5. The error probabilities of polar codes 
constructed by using density evolution are much smaller than those of polar codes 
constructed by the heuristic method. This result implies that information variables should be chosen by 
taking into account the channel itself, rather than its capacity only. This can easily be confirmed via the simplest case 
with $n = 2$: The error probability $P_e(W_{2^2}^{(1)})$ is less than, equal to, and larger than $P_e(W_{2^2}^{(2)})$ when the channel 
is the BEC, BSC, and binary additive white Gaussian noise channel (BAWGNC), respectively, irrespective 
of the channel parameters. In ||7l, jSl, the authors show that polar codes and SC decoding do not achieve 
symmetric capacity universally. 

[Figure 2: plot with horizontal axis "Coding rate" (ticks from 0.3 to 0.5) and two curves labeled "optimized for the BSC" and "optimized for the BEC".]

Figure 2. Comparison of the error probability of polar codes constructed by different 
methods. The bottom curve is the result of construction using density evolution. The top 
curve is the result of construction using the heuristic method of Arikan [1]. The channel is 
the BSC of crossover probability 0.11. The capacity is 0.5. The blocklength is 4096. 

CHAPTER 5 



Channel Polarization of $q$-ary DMC by Arbitrary Kernel 



5.1. Introduction 

Şaşoğlu, Telatar and Arikan considered channel polarization of $q$-ary channels [16]. They regarded $\mathcal{X}$ 
as $\mathbb{Z}/q\mathbb{Z}$ and assumed that the size $\ell$ of the channel transform is 2. They showed that the channel polarization 
phenomenon occurs on the $2 \times 2$ matrix (2.1) when $q$ is prime, and that using randomized permutations, the 
channel polarization phenomenon occurs for any $q$. In this chapter, we consider channel polarization for 
arbitrary $q$ and arbitrary channel transform [14]. 

5.2. Preliminaries 

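In this chapter, we assume that $|\mathcal{X}| = q$ and that the base of the logarithm is $q$ unless otherwise stated. Let $e$ 
denote the base of the natural logarithm. 

Definition 5.1. The symmetric capacity of a $q$-ary input channel $W : \mathcal{X} \to \mathcal{Y}$ is defined as 

\[ I(W) := \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} \frac{1}{q} W(y \mid x) \log \frac{W(y \mid x)}{\frac{1}{q} \sum_{x' \in \mathcal{X}} W(y \mid x')}. \]

Note that $I(W) \in [0, 1]$. 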

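Definition 5.2. Let $\mathcal{D}_x := \{y \in \mathcal{Y} \mid W(y \mid x) > W(y \mid x'),\ \forall x' \in \mathcal{X},\ x' \ne x\}$. The error probability of $W$ is 
defined as 

\[ P_e(W) := \frac{1}{q} \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{D}_x^c} W(y \mid x). \]

Definition 5.3. The Bhattacharyya parameter of $W$ is defined as 

\[ Z(W) := \frac{1}{q(q-1)} \sum_{x \in \mathcal{X},\, x' \in \mathcal{X},\, x \ne x'} Z_{x,x'}(W) \]

where the Bhattacharyya parameter between $x$ and $x'$ is defined as 

\[ Z_{x,x'}(W) := \sum_{y \in \mathcal{Y}} \sqrt{W(y \mid x) W(y \mid x')}. \]

Note that $Z(W) \in [0, 1]$ and that $Z_{x,x'}(W) \in [0, 1]$. 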
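Lemma 5.4. For any $x \in \mathcal{X}$, $x' \in \mathcal{X}$ and $x'' \in \mathcal{X}$, 

\[ \sqrt{1 - Z_{x,x'}(W)} \le \sqrt{1 - Z_{x,x''}(W)} + \sqrt{1 - Z_{x'',x'}(W)}. \]

Proof. The inequality follows from the triangle inequality of Euclidean distance since 

\[ \sqrt{2\left(1 - Z_{x,x'}(W)\right)} = \sqrt{\sum_{y \in \mathcal{Y}} \left( \sqrt{W(y \mid x)} - \sqrt{W(y \mid x')} \right)^2}. \]

□ 

Lemma 5.5. 

\[ P_e(W) \le (q - 1) Z(W). \]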



Lemma 5.6. [16] 

\[ I(W) \ge \log \frac{q}{1 + (q-1) Z(W)} \]

\[ I(W) \le \log(q/2) + (\log 2) \sqrt{1 - Z(W)^2} \]

\[ I(W) \le 2(q-1)(\log e) \sqrt{1 - Z(W)^2}. \]
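
The quantities in Definitions 5.1 to 5.3 and the bounds of Lemmas 5.5 and 5.6 can be checked numerically on a small example. The sketch below uses a $q$-ary symmetric channel as the test channel; the channel, the matrix representation `w[x][y]` $= W(y \mid x)$ and all function names are assumptions made for illustration and do not appear in the thesis.

```python
import math
from itertools import permutations

def qsc(q, p):
    """q-ary symmetric channel: w[x][y] = W(y|x), error mass p spread uniformly."""
    return [[1 - p if y == x else p / (q - 1) for y in range(q)] for x in range(q)]

def symmetric_capacity(w):
    """I(W) with logarithm base q, so the value lies in [0, 1]."""
    q = len(w)
    total = 0.0
    for y in range(len(w[0])):
        py = sum(w[x][y] for x in range(q)) / q
        for x in range(q):
            if w[x][y] > 0:
                total += w[x][y] / q * math.log(w[x][y] / py, q)
    return total

def pair_bhattacharyya(w, x, xp):
    """Z_{x,x'}(W)."""
    return sum(math.sqrt(a * b) for a, b in zip(w[x], w[xp]))

def bhattacharyya(w):
    """Z(W): average of Z_{x,x'}(W) over ordered pairs with x != x'."""
    q = len(w)
    pairs = permutations(range(q), 2)
    return sum(pair_bhattacharyya(w, x, xp) for x, xp in pairs) / (q * (q - 1))

def error_probability(w):
    """P_e(W): average mass of outputs outside the decision region D_x."""
    q = len(w)
    total = 0.0
    for x in range(q):
        for y in range(len(w[0])):
            in_dx = all(w[x][y] > w[xp][y] for xp in range(q) if xp != x)
            if not in_dx:
                total += w[x][y]
    return total / q
```

For the $q$-ary symmetric channel with $q = 4$ and error mass $0.1$, this gives $P_e(W) = 0.1$, and the three inequalities of Lemma 5.6 hold with room to spare.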

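Definition 5.7. The maximum and the minimum of the Bhattacharyya parameters between two alphabets are 
defined as 

\[ Z_{\max}(W) := \max_{x \in \mathcal{X},\, x' \in \mathcal{X},\, x \ne x'} Z_{x,x'}(W) \]

\[ Z_{\min}(W) := \min_{x \in \mathcal{X},\, x' \in \mathcal{X}} Z_{x,x'}(W). \]

Let $\sigma : \mathcal{X} \to \mathcal{X}$ be a permutation. Let $\sigma^i$ denote the $i$-th power of $\sigma$. The average Bhattacharyya parameter 
of $W$ between $x$ and $x'$ with respect to $\sigma$ is defined as the average of $Z_{z,z'}(W)$ over the subset $\{(z, z') = 
(\sigma^i(x), \sigma^i(x')) \in \mathcal{X}^2 \mid i = 0, 1, \dots, q! - 1\}$: 

\[ Z^{\sigma}_{x,x'}(W) := \frac{1}{q!} \sum_{i=0}^{q!-1} Z_{\sigma^i(x),\sigma^i(x')}(W). \]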

5.3. Channel Polarization on q-ary Channels 

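We consider channel transform using a one-to-one onto mapping $g : \mathcal{X}^\ell \to \mathcal{X}^\ell$, called a kernel. 

Definition 5.8. Let $W : \mathcal{X} \to \mathcal{Y}$ be a DMC. Then the DMCs $W^\ell : \mathcal{X}^\ell \to \mathcal{Y}^\ell$, $W^{(i)} : \mathcal{X} \to \mathcal{Y}^\ell \times \mathcal{X}^{i}$, and $W^{(i)}_{u_0^{i-1}} : \mathcal{X} \to \mathcal{Y}^\ell$ 
are defined as 

\[ W^\ell(y_0^{\ell-1} \mid x_0^{\ell-1}) := \prod_{i=0}^{\ell-1} W(y_i \mid x_i) \]

\[ W^{(i)}(y_0^{\ell-1}, u_0^{i-1} \mid u_i) := \frac{1}{q^{\ell-1}} \sum_{u_{i+1}^{\ell-1}} W^\ell(y_0^{\ell-1} \mid g(u_0^{\ell-1})) \]

\[ W^{(i)}_{u_0^{i-1}}(y_0^{\ell-1} \mid u_i) := \frac{1}{q^{\ell-1-i}} \sum_{u_{i+1}^{\ell-1}} W^\ell(y_0^{\ell-1} \mid g(u_0^{\ell-1})). \]

Assume that $\{B_i\}_{i \in \mathbb{N}}$ is a sequence of independent uniform random variables taking values on $\{0, \dots, \ell - 1\}$. 
In the probabilistic channel transform $W \to W^{(B_i)}$, the expectation of the symmetric capacity is invariant 
due to the chain rule for mutual information. The following lemma is a consequence of the martingale 
convergence theorem [5]. 

Lemma 5.9. There exists a random variable $I_\infty$ such that $I(W^{(B_1)\cdots(B_n)})$ converges to $I_\infty$ almost surely as 
$n \to \infty$. 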
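
The invariance of the expected symmetric capacity that underlies Lemma 5.9 can be checked numerically for the smallest case. The sketch below assumes $\ell = 2$, $\mathcal{X} = \mathbb{Z}/q\mathbb{Z}$ and the kernel $g(u_0, u_1) = (u_0 + u_1, u_1)$, which corresponds to the $2 \times 2$ matrix used in the binary case. The $q$-ary symmetric test channel, the dictionary representation and the function names are assumptions made for illustration.

```python
import math
from itertools import product

def qsc(q, p):
    """q-ary symmetric channel as a list of dicts: channel[x][y] = W(y|x)."""
    return [{y: (1 - p if y == x else p / (q - 1)) for y in range(q)} for x in range(q)]

def symmetric_capacity(channel):
    """I(W) with logarithm base q; outputs may be arbitrary hashable labels."""
    q = len(channel)
    outputs = set().union(*(row.keys() for row in channel))
    total = 0.0
    for y in outputs:
        py = sum(row.get(y, 0.0) for row in channel) / q
        for row in channel:
            w = row.get(y, 0.0)
            if w > 0:
                total += w / q * math.log(w / py, q)
    return total

def transform(channel):
    """Return (W^(0), W^(1)) for the assumed kernel g(u0, u1) = (u0 + u1, u1)."""
    q = len(channel)
    ys = sorted(set().union(*(row.keys() for row in channel)))
    w0 = [dict() for _ in range(q)]
    w1 = [dict() for _ in range(q)]
    for u0, u1 in product(range(q), repeat=2):
        x0, x1 = (u0 + u1) % q, u1
        for y0, y1 in product(ys, repeat=2):
            pr = channel[x0].get(y0, 0.0) * channel[x1].get(y1, 0.0) / q
            # W^(0): output (y0, y1), the later input u1 is summed out
            w0[u0][(y0, y1)] = w0[u0].get((y0, y1), 0.0) + pr
            # W^(1): output (y0, y1, u0), the earlier input is revealed
            w1[u1][(y0, y1, u0)] = pr
    return w0, w1
```

With `w0, w1 = transform(qsc(3, 0.2))`, the sum of the two symmetric capacities equals twice that of the original channel up to rounding, while $I(W^{(0)})$ is smaller and $I(W^{(1)})$ is larger than $I(W)$.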



From Lemma 5.6, $I(W)$ is close to 0 and 1 when $Z(W)$ is close to 1 and 0, respectively. In order to 
show channel polarization, i.e., $I_\infty \in \{0, 1\}$ with probability 1, it suffices to show $\lim_{n\to\infty} P(Z(W^{(B_1)\cdots(B_n)}) \in 
(\delta, 1 - \delta)) = 0$ for any $\delta \in (0, 1/2)$. The following lemma is useful for this purpose. 

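Lemma 5.10. Let $\{\mathcal{Y}_n\}_{n \in \mathbb{N}}$ be a random process taking values on a discrete set. Let $\{W_n : \mathcal{X} \to \mathcal{Y}_n\}_{n \in \mathbb{N}}$ be 
a random process taking values on $q$-ary DMC. Let $\sigma$ and $\tau$ be permutations on $\mathcal{X}$. Let 

\[ W_n'(y_1, y_2 \mid x) = W_n(y_1 \mid \sigma(x)) W_n(y_2 \mid \tau(x)). \]

Assume 

\[ \lim_{n \to \infty} |I(W_n') - I(W_n)| = 0 \]

with probability 1. Then $\lim_{n\to\infty} P(Z^{\tau\sigma^{-1}}_{x,x'}(W_n) \in (\delta, 1 - \delta)) = 0$ for any $x \in \mathcal{X}$, $x' \in \mathcal{X}$ and $\delta \in (0, 1/2)$. 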



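Proof. Let $Z$, $Y_1$ and $Y_2$ be random variables which take values on $\mathcal{X}$, $\mathcal{Y}_n$ and $\mathcal{Y}_n$, respectively, and 
jointly obey the distribution 

\[ P_n(Z = z, Y_1 = y_1, Y_2 = y_2) = \frac{1}{q} W_n(y_1 \mid \sigma(z)) W_n(y_2 \mid \tau(z)). \]

Since $I(W_n') = I(Z; Y_1, Y_2)$ and $I(W_n) = I(Z; Y_1)$, $I(Z; Y_1, Y_2) - I(Z; Y_1) = I(Z; Y_2 \mid Y_1)$ tends to 0 with 
probability 1 by the assumption. Since the mutual information is lower bounded by the cutoff rate as shown in 
Proposition 1.3, one obtains 

\begin{align*}
I(Z; Y_2 \mid Y_1) &\ge -\log \sum_{y_1 \in \mathcal{Y}_n} P_n(Y_1 = y_1) \sum_{y_2 \in \mathcal{Y}_n} \left[ \sum_{z \in \mathcal{X}} P_n(Z = z \mid Y_1 = y_1) \sqrt{P_n(Y_2 = y_2 \mid Z = z, Y_1 = y_1)} \right]^2 \\
&= -\log \sum_{y_1 \in \mathcal{Y}_n,\, z \in \mathcal{X},\, x \in \mathcal{X}} P_n(Y_1 = y_1) P_n(Z = z \mid Y_1 = y_1) P_n(Z = x \mid Y_1 = y_1) Z_{\tau(z),\tau(x)}(W_n) \\
&= -\log \sum_{y_1 \in \mathcal{Y}_n,\, z \in \mathcal{X},\, x \in \mathcal{X}} q_n(y_1, z, x) Z_{\tau(\sigma^{-1}(z)),\tau(\sigma^{-1}(x))}(W_n)
\end{align*}

where 

\[ q_n(y_1, z, x) := P_n(Y_1 = y_1) P_n(Z = \sigma^{-1}(z) \mid Y_1 = y_1) P_n(Z = \sigma^{-1}(x) \mid Y_1 = y_1). \]

Since 

\begin{align*}
\sum_{y_1 \in \mathcal{Y}_n} q_n(y_1, z, x) &= \sum_{y_1 \in \mathcal{Y}_n} P_n(Y_1 = y_1) \left( \sqrt{P_n(Z = \sigma^{-1}(z) \mid Y_1 = y_1) P_n(Z = \sigma^{-1}(x) \mid Y_1 = y_1)} \right)^2 \\
&\ge \left( \sum_{y_1 \in \mathcal{Y}_n} P_n(Y_1 = y_1) \sqrt{P_n(Z = \sigma^{-1}(z) \mid Y_1 = y_1) P_n(Z = \sigma^{-1}(x) \mid Y_1 = y_1)} \right)^2 \\
&= \frac{1}{q^2} Z_{z,x}(W_n)^2
\end{align*}

it holds 

\[ I(Z; Y_2 \mid Y_1) \ge -\log \left[ 1 - \frac{1}{q^2} \sum_{z \in \mathcal{X},\, x \in \mathcal{X}} Z_{z,x}(W_n)^2 \left( 1 - Z_{\tau(\sigma^{-1}(z)),\tau(\sigma^{-1}(x))}(W_n) \right) \right]. \]

The convergence of $I(Z; Y_2 \mid Y_1)$ to 0 with probability 1 implies that 

\[ Z_{z,x}(W_n)^2 \left( 1 - Z_{\tau(\sigma^{-1}(z)),\tau(\sigma^{-1}(x))}(W_n) \right) \]

tends to 0 with probability 1 for any $(z, x) \in \mathcal{X}^2$. It consequently implies $\lim_{n\to\infty} P(Z^{\tau\sigma^{-1}}_{z,x}(W_n) \in (\delta, 1 - \delta)) = 
0$ for any $(z, x) \in \mathcal{X}^2$ and $\delta \in (0, 1/2)$. □ 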

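Corollary 5.11. Assume that there exist $u_0^{\ell-2} \in \mathcal{X}^{\ell-1}$, $(i, j) \in \{0, 1, \dots, \ell - 1\}^2$ and permutations $\sigma$ and $\tau$ 
on $\mathcal{X}$ such that the $i$-th element of $g(u_0^{\ell-1})$ and the $j$-th element of $g(u_0^{\ell-1})$ are $\sigma(u_{\ell-1})$ and $\tau(u_{\ell-1})$, respectively, 
and such that for any $v_0^{\ell-2} \ne u_0^{\ell-2} \in \mathcal{X}^{\ell-1}$ there exist $m \in \{0, 1, \dots, \ell - 1\}$ and a permutation $\mu$ on $\mathcal{X}$ such 
that the $m$-th element of $g(v_0^{\ell-1})$ is $\mu(v_{\ell-1})$. Then, $\lim_{n\to\infty} P(Z^{\tau\sigma^{-1}}_{x,x'}(W_n) \in (\delta, 1 - \delta)) = 0$ for all $x \in \mathcal{X}$, $x' \in \mathcal{X}$ 
and $\delta \in (0, 1/2)$. 

Proof. Since $I(W^{(B_1)\cdots(B_n)})$ converges with probability 1, $|I(W^{(B_1)\cdots(B_n)(\ell-1)}) - I(W^{(B_1)\cdots(B_n)})|$ has 
to converge to 0 with probability 1. Let $U_0^{\ell-1}$ and $Y_0^{\ell-1}$ denote random variables ranging over $\mathcal{X}^\ell$ and $\mathcal{Y}^\ell$, and 
obeying the distribution 

\[ P(U_0^{\ell-1} = u_0^{\ell-1}, Y_0^{\ell-1} = y_0^{\ell-1}) = \frac{1}{q^\ell} W^\ell(y_0^{\ell-1} \mid g(u_0^{\ell-1})). \]

Then, it holds 

\[ I(W^{(\ell-1)}) = I(Y_0^{\ell-1}, U_0^{\ell-2}; U_{\ell-1}) = \frac{1}{q^{\ell-1}} \sum_{u_0^{\ell-2} \in \mathcal{X}^{\ell-1}} I(Y_0^{\ell-1}; U_{\ell-1} \mid U_0^{\ell-2} = u_0^{\ell-2}). \]

From the assumption, $I(Y_0^{\ell-1}; U_{\ell-1} \mid U_0^{\ell-2} = u_0^{\ell-2}) \ge I(W)$ for all $u_0^{\ell-2} \in \mathcal{X}^{\ell-1}$. Hence, $|I((W^{(B_1)\cdots(B_n)})') - 
I(W^{(B_1)\cdots(B_n)})|$ has to converge to 0 with probability 1. By applying Lemma 5.10, one obtains the result. □ 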

When $q = 2$, Corollary 5.11 is sufficient to show the channel polarization phenomenon. The derivation 
does not use linearity of a kernel. When we assume that $\mathcal{X}$ is a finite field and that a kernel $g$ is linear, the 
matrix $G$ representing the kernel $g$ is assumed to be lower triangular for the same reason as in Chapter 2. 

Theorem 5.12. Assume that $\mathcal{X}$ is a prime field, and that a linear kernel $G$ is not diagonal. Then, $P(I_\infty \in 
\{0, 1\}) = 1$. 

Proof. Let $k$ be the largest number such that the number of non-zero elements in the $k$-th row of $G$ is larger 
than 1. Without loss of generality, we assume $G_{kk} = 1$. It holds 

\[ W^{(k)}(y_0^{\ell-1}, u_0^{k-1} \mid u_k) = \frac{1}{q^{\ell-1}} \prod_{j=k+1}^{\ell-1} \left( \sum_{x \in \mathcal{X}} W(y_j \mid x) \right) \prod_{j \in S_0} W(y_j \mid x_j) \prod_{j \in S_1} W(y_j \mid G_{kj} u_k + x_j) \]

where $S_0 := \{j \in \{0, \dots, k\} \mid G_{kj} = 0\}$, $S_1 := \{j \in \{0, \dots, k\} \mid G_{kj} \ne 0\}$, and $x_j$ is the $j$-th element of 
$(u_0^{k-1}, 0_k^{\ell-1}) G$ where $0_k^{\ell-1}$ is the all-zero vector of length $\ell - k$. Let $m \in \{0, \dots, k - 1\}$ be an arbitrary index such 
that $G_{km} \ne 0$. Since each $u_0^{k-1}$ occurs with positive probability $1/q^k$, we can apply Lemma 5.10 with $\sigma(x) = x$ 
and $\tau(x) = G_{km} x + z$ for an arbitrary $z \in \mathcal{X}$. Hence, for sufficiently large $n$, $Z^{\tau}_{x,x'}(W^{(B_1)\cdots(B_n)})$ is close to 0 
or 1 almost surely where $\tau(x) = G_{km}^i x + z$ for all $i \in \{0, \dots, q - 2\}$ and all $z \in \mathcal{X}$. Since $q$ is prime, for any 
$x \in \mathcal{X}$ and $x' \in \mathcal{X}$ where $x \ne x'$, $Z^{\mu}_{x,x'}$ is close to 1 if and only if $Z(W^{(B_1)\cdots(B_n)})$ is close to 1, where 

\[ \mu(z) = z + x' - x. \] □ 

This result is a simple generalization of the special case considered by Şaşoğlu, Telatar and Arikan [16]. 
We also show another sufficient condition for channel polarization in the following corollary. 

Corollary 5.13. Assume that $\mathcal{X}$ is a field and that a linear kernel $G$ is not diagonal. Let $k$ be the largest 
number such that the number of non-zero elements in the $k$-th row of $G$ is larger than 1. If there exists $j \in 
\{0, \dots, k - 1\}$ such that $G_{kj}/G_{kk}$ is a primitive element, it holds $P(I_\infty \in \{0, 1\}) = 1$. 

Proof. By applying Lemma 5.10, one sees that $\lim_{n\to\infty} P(Z^{\sigma}_{x,x'}(W^{(B_1)\cdots(B_n)}) \in (\delta, 1 - \delta)) = 0$ for all 
$x \in \mathcal{X}$, $x' \in \mathcal{X}$ and $\delta \in (0, 1/2)$, where $\sigma(x) = (G_{kj}/G_{kk}) x + z$ for an arbitrary $z \in \mathcal{X}$. It suffices to show that 
for any $x \in \mathcal{X}$ and $x' \in \mathcal{X}$, $x \ne x'$, $Z_{x,x'}(W^{(B_1)\cdots(B_n)})$ is close to 1 if and only if $Z(W^{(B_1)\cdots(B_n)})$ is close to 1. When 
$Z_{x,x'}$ is close to 1, $Z_{0,(G_{kj}/G_{kk})(x'-x)}(W^{(B_1)\cdots(B_n)})$ is close to 1. Hence, $Z_{0,(G_{kj}/G_{kk})^i(x'-x)}(W^{(B_1)\cdots(B_n)})$ 
is close to 1 for any $i \in \{0, \dots, q - 2\}$. Since $G_{kj}/G_{kk}$ is a primitive element, $Z_{0,x}(W^{(B_1)\cdots(B_n)})$ is close to 1 
for any $x \in \mathcal{X}$. From Lemma 5.4, it completes the proof. □ 
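
The hypothesis of Corollary 5.13 is easy to test mechanically for a given kernel. The sketch below does this over the field of four elements. The integer encoding of the field elements ($0$, $1$, $2 = \alpha$, $3 = \alpha^2 = \alpha + 1$), the table-based arithmetic and the function names are assumptions made for illustration. Since the corollary gives a sufficient condition only, a `False` result does not by itself show that polarization fails.

```python
import math

# Assumed encoding of GF(4): 0, 1, 2 = a, 3 = a^2 = a + 1, where a is primitive.
EXP = [1, 2, 3]            # a^0, a^1, a^2
LOG = {1: 0, 2: 1, 3: 2}   # discrete logarithm to base a

def gf4_mul(x, y):
    if x == 0 or y == 0:
        return 0
    return EXP[(LOG[x] + LOG[y]) % 3]

def gf4_inv(x):
    return EXP[(-LOG[x]) % 3]

def is_primitive(x):
    """x generates the multiplicative group of order 3."""
    return x != 0 and math.gcd(LOG[x], 3) == 1

def corollary_5_13_applies(G):
    """Check the hypothesis of Corollary 5.13 for a lower triangular matrix G over GF(4)."""
    ell = len(G)
    rows = [k for k in range(ell) if sum(1 for v in G[k] if v != 0) > 1]
    if not rows:
        return False                      # G is diagonal
    k = max(rows)
    if G[k][k] == 0:
        return False
    inv = gf4_inv(G[k][k])
    return any(G[k][j] != 0 and is_primitive(gf4_mul(G[k][j], inv))
               for j in range(k))
```

For example, `corollary_5_13_applies([[1, 0], [1, 1]])` is `False` because the ratio $G_{10}/G_{11} = 1$ is not primitive, whereas `corollary_5_13_applies([[1, 0], [2, 1]])` is `True`.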

5.4. Speed of Polarization 

The result in Chapter 3 is also applicable to non-binary channel polarization. 

Definition 5.14. Partial distance of a kernel $g : \mathcal{X}^\ell \to \mathcal{X}^\ell$ is defined as 

\[ D^{(i)}_{x,x'}(u_0^{i-1}) := \min_{v_{i+1}^{\ell-1},\, w_{i+1}^{\ell-1}} d\left( g(u_0^{i-1}, x, v_{i+1}^{\ell-1}),\, g(u_0^{i-1}, x', w_{i+1}^{\ell-1}) \right) \]

where $d(a, b)$ denotes the Hamming distance between $a \in \mathcal{X}^\ell$ and $b \in \mathcal{X}^\ell$. 

We also use the following quantities. 

\[ D^{(i)}_{x,x'} := \min_{u_0^{i-1}} D^{(i)}_{x,x'}(u_0^{i-1}), \qquad D^{(i)}_{\max} := \max_{x,x',\, x \ne x'} D^{(i)}_{x,x'}, \qquad D^{(i)}_{\min} := \min_{x,x',\, x \ne x'} D^{(i)}_{x,x'}. \]

When $g$ is linear, $D^{(i)}_{x,x'}(u_0^{i-1})$ does not depend on $x$, $x'$ or $u_0^{i-1}$, in which case we will use the notation $D^{(i)}$ 
instead of $D^{(i)}_{x,x'}(u_0^{i-1})$. For a full-rank square matrix $G$, $E(G)$ and $V(G)$ are defined in the same way as in 
Definition [33] 

In order to apply the method in Chapter 3, the following lemma, similar to Lemma |33], is used. 
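Lemma 5.15. 

\[ \frac{1}{q^{2(\ell-1-i)}} Z_{\min}(W)^{D^{(i)}_{x,x'}(u_0^{i-1})} \le Z_{x,x'}(W^{(i)}_{u_0^{i-1}}) \le q^{\ell-1-i} Z_{\max}(W)^{D^{(i)}_{x,x'}(u_0^{i-1})}. \]

Proof. Proof of the second inequality is almost the same as the proof in [9]. 

\begin{align*}
Z_{x,x'}(W^{(i)}_{u_0^{i-1}}) &= \sum_{y_0^{\ell-1}} \sqrt{W^{(i)}_{u_0^{i-1}}(y_0^{\ell-1} \mid x) W^{(i)}_{u_0^{i-1}}(y_0^{\ell-1} \mid x')} \\
&= q^i \sum_{y_0^{\ell-1}} \sqrt{W^{(i)}(y_0^{\ell-1}, u_0^{i-1} \mid x) W^{(i)}(y_0^{\ell-1}, u_0^{i-1} \mid x')} \\
&= \frac{1}{q^{\ell-1-i}} \sum_{y_0^{\ell-1}} \sqrt{\sum_{v_{i+1}^{\ell-1},\, w_{i+1}^{\ell-1}} W^\ell(y_0^{\ell-1} \mid g(u_0^{i-1}, x, v_{i+1}^{\ell-1})) W^\ell(y_0^{\ell-1} \mid g(u_0^{i-1}, x', w_{i+1}^{\ell-1}))} \\
&\le \frac{1}{q^{\ell-1-i}} \sum_{v_{i+1}^{\ell-1},\, w_{i+1}^{\ell-1}} Z_{\max}(W)^{D^{(i)}_{x,x'}(u_0^{i-1})} \\
&= q^{\ell-1-i} Z_{\max}(W)^{D^{(i)}_{x,x'}(u_0^{i-1})}.
\end{align*}

The first inequality is obtained as follows. Keeping only the term of the inner sum whose pair $(v_{i+1}^{\ell-1}, w_{i+1}^{\ell-1})$ attains the partial distance, 

\begin{align*}
Z_{x,x'}(W^{(i)}_{u_0^{i-1}}) &= q^i \sum_{y_0^{\ell-1}} \sqrt{W^{(i)}(y_0^{\ell-1}, u_0^{i-1} \mid x) W^{(i)}(y_0^{\ell-1}, u_0^{i-1} \mid x')} \\
&\ge \frac{1}{q^{\ell-1-i}} Z_{\min}(W)^{D^{(i)}_{x,x'}(u_0^{i-1})} \\
&\ge \frac{1}{q^{2(\ell-1-i)}} Z_{\min}(W)^{D^{(i)}_{x,x'}(u_0^{i-1})}.
\end{align*}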

Corollary 5.16. For $i \in \{0, \dots, \ell - 1\}$, 



□ 

From Propositions 3.10 and 3.11 and Corollary 5.16, the following theorems are obtained. 

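Theorem 5.17. Assume $P(I_\infty(W) \in \{0, 1\}) = 1$. Let $f(n)$ be an arbitrary function satisfying $f(n) = o(\sqrt{n})$. 
It holds 

\[ \liminf_{n \to \infty} P\left( Z(W^{(B_1)\cdots(B_n)}) < 2^{-\ell^{E_1(g) n + t \sqrt{V_1(g) n} + f(n)}} \right) \ge I(W) Q(t) \]

where $E_1(g) = (1/\ell) \sum_{i=0}^{\ell-1} \log_\ell D^{(i)}_{\min}$, and where $V_1(g) = (1/\ell) \sum_{i=0}^{\ell-1} (\log_\ell D^{(i)}_{\min} - E_1(g))^2$. 
When $Z_{\min}(W) > 0$, 

\[ \limsup_{n \to \infty} P\left( Z(W^{(B_1)\cdots(B_n)}) < 2^{-\ell^{E_2(g) n + t \sqrt{V_2(g) n} + f(n)}} \right) \le I(W) Q(t) \]

where $E_2(g) = (1/\ell) \sum_{i=0}^{\ell-1} \log_\ell D^{(i)}_{\max}$ and where $V_2(g) = (1/\ell) \sum_{i=0}^{\ell-1} (\log_\ell D^{(i)}_{\max} - E_2(g))^2$. 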

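Theorem 5.18. Assume that $g$ is a linear kernel represented by a matrix $G$ and that $P(I_\infty(W) \in \{0, 1\}) = 1$. 
Let $f(n)$ be an arbitrary function satisfying $f(n) = o(\sqrt{n})$. It holds 

\[ \liminf_{n \to \infty} P\left( Z(W^{(B_1)\cdots(B_n)}) < 2^{-\ell^{E(G) n + t \sqrt{V(G) n} + f(n)}} \right) \ge I(W) Q(t). \]

When $Z_{\min}(W) > 0$, 

\[ \limsup_{n \to \infty} P\left( Z(W^{(B_1)\cdots(B_n)}) < 2^{-\ell^{E(G) n + t \sqrt{V(G) n} + f(n)}} \right) \le I(W) Q(t). \]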

5.5. Reed-Solomon kernel 

Assume that $\mathcal{X}$ is a field and that $\alpha \in \mathcal{X}$ is its primitive element. For a non-zero element $\gamma \in \mathcal{X}$, let 

\[ G = \begin{bmatrix}
1 & 1 & \cdots & 1 & 1 & 0 \\
\alpha^{(q-2)(q-2)} & \alpha^{(q-3)(q-2)} & \cdots & \alpha^{q-2} & 1 & 0 \\
\alpha^{(q-2)(q-3)} & \alpha^{(q-3)(q-3)} & \cdots & \alpha^{q-3} & 1 & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots \\
\alpha^{q-2} & \alpha^{q-3} & \cdots & \alpha & 1 & 0 \\
1 & 1 & \cdots & 1 & 1 & \gamma
\end{bmatrix}. \]

When $q$ is prime, the channel polarization phenomenon occurs for any $\gamma \ne 0$. When $\gamma$ is a primitive element 
of $\mathcal{X}$, the channel polarization phenomenon occurs for any field $\mathcal{X}$. We call $G$ a Reed-Solomon kernel since its 
submatrix which consists of the $i$-th row to the $(q-1)$-th row is a generator matrix of a generalized Reed-Solomon 
code for any $i \in \{0, \dots, q - 1\}$ Iflll . Since generalized Reed-Solomon codes are maximum distance separable 
(MDS) codes, it holds $D^{(i)} = i + 1$. Hence, the exponent of the Reed-Solomon kernel is $(1/\ell) \log_\ell(\ell!)$ where $\ell = q$. 
Since 

\[ \frac{1}{\ell} \sum_{i=0}^{\ell-1} \log_\ell(i+1) \ge \frac{1}{\ell \log_e \ell} \int_1^{\ell} \log_e x \, dx = 1 - \frac{\ell - 1}{\ell \log_e \ell} \]

the exponent of the Reed-Solomon kernel tends to 1 as $\ell = q$ tends to infinity. The exponent of the Reed-
Solomon kernel of size $2^2$ is $\log 24 / (4 \log 4) \approx 0.57312$. In [9], the authors showed that, by using large 
kernels, the exponent can be improved, and found the best matrix of size 16 whose exponent is about 0.51828. 
The exponent of the Reed-Solomon kernel on $\mathbb{F}_4$ of size 4 is larger than the largest exponent of binary matrices 
of size 16. 
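
The claims $D^{(i)} = i + 1$ and the value of the exponent can be checked by brute force on a small case. The sketch below builds the Reed-Solomon kernel over a prime field only, so that plain modular arithmetic suffices; the case $\mathbb{F}_4$ discussed above would need extension-field arithmetic and is covered here only through the closed-form exponent. The function names and the default $\gamma = 1$ are assumptions made for illustration.

```python
import math
from itertools import product

def rs_kernel(p, alpha, gamma=1):
    """Reed-Solomon kernel over the prime field F_p.
    Columns correspond to the points alpha^(p-2), ..., alpha, 1, 0 and
    row k evaluates x^(p-1-k); the last entry of the last row is gamma."""
    points = [pow(alpha, j, p) for j in range(p - 2, -1, -1)] + [0]
    g = [[pow(x, p - 1 - k, p) for x in points] for k in range(p)]
    g[p - 1][p - 1] = gamma % p
    return g

def partial_distances(g, p):
    """D^(i) of a linear kernel: minimum weight of row i plus any
    combination of the rows below it (brute force, small sizes only)."""
    ell = len(g)
    dists = []
    for i in range(ell):
        best = ell
        for coeffs in product(range(p), repeat=ell - 1 - i):
            word = list(g[i])
            for c, row in zip(coeffs, g[i + 1:]):
                word = [(w + c * r) % p for w, r in zip(word, row)]
            best = min(best, sum(1 for w in word if w != 0))
        dists.append(best)
    return dists

def exponent(dists):
    """(1/l) * sum_i log_l D^(i)."""
    ell = len(dists)
    return sum(math.log(d, ell) for d in dists) / ell

def rs_exponent(ell):
    """Closed form (1/l) * log_l(l!) for the Reed-Solomon kernel."""
    return math.lgamma(ell + 1) / (ell * math.log(ell))
```

With `rs_kernel(5, 2)` (2 is a primitive element of $\mathbb{F}_5$), `partial_distances` returns `[1, 2, 3, 4, 5]`, and `rs_kernel(2, 1)` reduces to the binary $2 \times 2$ matrix `[[1, 0], [1, 1]]`.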

The Reed-Solomon kernel can be regarded as a natural generalization of the $2 \times 2$ matrix (2.1). Note that 
a generator matrix of the $r$-th order $q$-ary Reed-Muller code of length $q^n$ is constructed by choosing rows 

\[ \left\{ j \in \{0, \dots, q^n - 1\} \;\middle|\; \sum_{i} b_i(j) \ge (q-1)n - r \right\} \]

from $G^{\otimes n}$, where $b_i(j)$ is the $i$-th element of the $q$-ary expansion of $j$. The relation between binary polar codes 
and binary Reed-Muller codes was mentioned by Arikan ll), |[1]. 
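
The row-selection rule above can be illustrated in the binary case, where the Reed-Solomon kernel reduces to the $2 \times 2$ matrix and the digit sum of $j$ is the number of ones in its binary expansion. The sketch below is limited to $q = 2$ and to sizes small enough for exhaustive search; the function names are assumptions made for illustration.

```python
from itertools import product

F = [[1, 0], [1, 1]]  # binary 2x2 kernel

def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def kernel_power(n):
    """n-fold Kronecker power of F."""
    g = [[1]]
    for _ in range(n):
        g = kron(g, F)
    return g

def reed_muller_rows(n, r):
    """Row indices j whose binary digit sum is at least n - r."""
    return [j for j in range(2 ** n) if bin(j).count("1") >= n - r]

def min_distance(rows):
    """Minimum weight over all non-zero codewords (exhaustive search)."""
    best = len(rows[0])
    for msg in product((0, 1), repeat=len(rows)):
        if any(msg):
            word = [sum(m * row[c] for m, row in zip(msg, rows)) % 2
                    for c in range(len(rows[0]))]
            best = min(best, sum(word))
    return best
```

For $n = 3$ and $r = 1$ the selected rows are 3, 5, 6 and 7, and the resulting code of length 8 has dimension 4 and minimum distance 4, as expected for the first-order Reed-Muller code.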
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Summary 



In the thesis, we have seen the channel polarization phenomenon and polar codes. It is shown that 
polar codes are constructed with linear complexity in the blocklength for symmetric B-DMC. The channel 
polarization phenomenon on $q$-ary channels has also been considered. We see sufficient conditions of kernels 
on which the channel polarization phenomenon occurs. We also see that the Reed-Solomon kernel is a 
natural generalization to $q$-ary alphabet of the $2 \times 2$ matrix (2.1) as a binary matrix. The exponent of the 
Reed-Solomon kernel tends to 1 as $q$ tends to infinity. The exponent of the Reed-Solomon kernel of size $2^2$ 
is larger than the largest exponent for binary matrices of size 16. 